May 26, 2026

Rocklin Lab Unveils Extensive Protein Stability Dataset to Enhance Biomolecular AI

Rocklin Lab Releases Megascale Open Protein Stability Dataset to Advance Biomolecular AI

Businesswire

5 min read · Gabriel Rocklin, Sergey Ovchinnikov, Kotaro Tsuboyama, Yehlin Cho, Rocklin Lab

💡 In a Nutshell

The Rocklin Lab at Northwestern University has released the MGnify Stability Dataset, which contains folding stability measurements for 1.8 million protein domains. This dataset aims to improve machine learning models for protein stability prediction and is supported by the OpenFold Consortium.

🔑 Key Points

01 The MGnify Stability Dataset includes stability measurements for 1.8 million diverse protein domains, significantly expanding the available data for protein stability research.
02 This dataset provides crucial negative data on unstable proteins, essential for training machine learning models to distinguish between stable and unstable sequences.
03 The study was led by Gabriel Rocklin and Sergey Ovchinnikov, with contributions from co-lead researchers Kotaro Tsuboyama and Yehlin Cho.
04 The predictive models developed, SaProtΔG and ESM3ΔG, demonstrate improved accuracy in predicting stability for small protein domains compared to previous models.
05 OpenFold aims to support the development of open, high-quality experimental datasets to advance biomolecular AI for drug discovery and biological research.

📝 Full Summary

The Rocklin Lab at Northwestern University has announced the release of the MGnify Stability Dataset, a comprehensive resource containing folding stability measurements for 1.8 million diverse protein domains. This initiative, supported by the OpenFold Consortium, addresses the critical need for both stable and unstable protein data, which is often lacking in existing biological datasets. The dataset was created using advanced experimental techniques and is designed to enhance the accuracy of machine learning models for predicting protein stability. Led by Gabriel Rocklin and Sergey Ovchinnikov, the research team included co-lead researchers Kotaro Tsuboyama and Yehlin Cho, who developed predictive models, SaProtΔG and ESM3ΔG, that effectively leverage this extensive dataset. These models not only predict stability but also recover trends associated with thermophilic organisms and improve the differentiation between stable and unstable proteins. The dataset is crucial for the ongoing development of open biomolecular AI, as it provides the foundational data necessary for advancing predictive capabilities in protein engineering and drug discovery.

#️⃣ Key Figures

1.8 million

Number of diverse protein domains in the MGnify Stability Dataset

60–80

Length of protein domains included in the dataset, measured in amino acids

5 kcal/mol

Approximate resolution of experimental stabilities in the dataset

❗ Why It Matters

The MGnify Stability Dataset will significantly enhance the ability of researchers to predict protein stability, which is vital for drug discovery and biotechnology applications.

👥 Who is affected

Researchers in the fields of biology, drug discovery, and protein engineering will benefit from improved predictive models.

ℹ️ What to know

Researchers should access the MGnify Stability Dataset to leverage its data for advancing their studies in protein stability.

❓ FAQ

What is the MGnify Stability Dataset?

The MGnify Stability Dataset is a large-scale experimental resource that includes folding stability measurements for 1.8 million protein domains, aiding in the prediction of protein stability.

What are SaProtΔG and ESM3ΔG?

SaProtΔG and ESM3ΔG are predictive models developed from the MGnify Stability Dataset that enhance the accuracy of predicting stability for small protein domains.
