Open-source data sharing is helping researchers understand COVID-19

Researchers are using open-source science tools to battle the coronavirus outbreak

The coronavirus pandemic is on everyone’s mind. Meanwhile, researchers around the globe are working day and night to find ways to contain the virus and, hopefully, cure the disease it causes. Although traditional methods like modeling and statistics are certainly helpful, a newer tool is also making an impact.

Scientists are using “open science” platforms that share genomic data to help identify connections between cases and new strains of the novel coronavirus. This strategy has already helped reveal that COVID-19 had been spreading in Seattle for weeks longer than researchers thought.

Moving forward, it could help shape the world’s counterattack as humanity seeks new ways to fend off the virus.

Making Comparisons

A key organization in the fight against COVID-19 has been the Seattle Flu Study. Its work, both with testing and research, has helped the U.S. mount its defense against the virus. At the end of February, the organization shared genomic data with Gisaid, an open-source science platform, about a strain of the coronavirus that a Seattle teenager had contracted.

Shortly after, researchers from another open science project, Nextstrain, found that the particular strain was the direct descendant of the virus that had infected an unrelated patient, also from the Seattle area. That connection revealed that the virus had been spreading in the community for almost a month longer than previously believed.

In the days since, that data has helped shape the public response to COVID-19. Since viruses like this one spread by copying themselves, there is a chance for errors to occur with each transmission. These genetic “typos” accumulate over time, so researchers can use them to tell strains apart and learn how they spread.
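The idea of tracking genetic “typos” can be illustrated with a small Python sketch. It is only a toy illustration and not how Nextstrain itself works: real analyses build phylogenetic trees from full, aligned genomes. The short sequences and the function names `mutations` and `could_descend_from` are invented for this example. The sketch lists the positions where each sample differs from a reference, then checks whether one sample carries every typo found in another, plus possibly some new ones, which is consistent with (though not proof of) descent.

```python
def mutations(reference, sample):
    """Return the set of (position, reference_base, sample_base) substitutions."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return {
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    }


def could_descend_from(reference, ancestor, candidate):
    """True if the candidate carries every substitution seen in the ancestor.

    This is a simplification: shared typos are consistent with descent,
    but they do not prove it.
    """
    return mutations(reference, ancestor) <= mutations(reference, candidate)


# Hypothetical toy sequences, invented for illustration only.
reference = "ATGGCTAGCTAACGT"
patient_a = "ATGGCTAGTTAACGT"  # one typo: position 8, C -> T
patient_b = "ATGACTAGTTAACGT"  # same typo, plus position 3, G -> A

print(sorted(mutations(reference, patient_a)))
print(sorted(mutations(reference, patient_b)))
print(could_descend_from(reference, patient_a, patient_b))
print(could_descend_from(reference, patient_b, patient_a))
```

In this toy case, the second sample shares the first sample's typo and adds one of its own, so it could plausibly sit further down the same chain of transmission. The reverse check fails because the first sample lacks the newer typo.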

Open-source platforms like Nextstrain help researchers follow the evolution of pathogens like the coronavirus. They depend on data shared by scientists from around the world. Nextstrain gives researchers access to that data far sooner than they would get it by waiting for it to appear in a journal.

Kristian G. Andersen, a computational biologist from Scripps Research, says, “Nextstrain can be used to give a quick snapshot of how the virus has spread across regions and how local outbreaks are connected.”

Haste Makes Mistakes

Unfortunately, open-source platforms (and the conclusions they help foster) aren’t always reliable. When it comes to sensitive medical research and analysis, accuracy is key. The hasty nature of applications like Nextstrain opens the door for errors to occur.

For instance, on March 3, the platform’s co-founder Trevor Bedford concluded that a strain of the coronavirus circulating in Italy was related to one found in Germany, an outbreak that health officials believed had been contained.

However, virologists were quick to respond, claiming that the evidence was “not sufficient to claim a link between Munich and Italy.” Rather, they argued that the strain may have arrived in both places from one outside source.

Bedford apologized for his inaccurate conclusion, saying, “Nonprofessionals will certainly sometimes misinterpret the information on Nextstrain, but I strongly believe that we’re pushing things toward more accurate public information. I absolutely believe that transparency is the best thing for global public health to be aiming for right now.”

On that score, he has a point. Transparency is a crucial tool in the fight against the pandemic spreading worldwide. If scientists can access data sooner by using open-source platforms, then we may be able to respond more appropriately. However, reliance on such tools must be paired with extra caution to ensure that conclusions are legitimate.

As of now, the CDC isn’t using platforms like Nextstrain to significantly shape how it responds to the COVID-19 outbreak. However, as the pandemic continues to evolve, that could change.


