MIT Researchers Use Song-Representations of Protein Sequences to Develop New Proteins

Dev Kapadia ’23

Figure 1:
The graph above depicts the vibrational spectrum of carbon monoxide (CO), with frequency on the x-axis and vibrational intensity on the y-axis. Transitions between the molecule's quantized energy levels produce the bands of vibrational intensity seen at different frequencies.
(Source: Wikimedia Commons)

To most people, science and music have no obvious intersection beyond researchers listening to their favorite songs while working in the lab. But to Markus Buehler and his team at the Massachusetts Institute of Technology (MIT), aspects of music such as volume, tempo, and the number of melodies played at once can represent complex biological information such as protein sequences and structures.[1]

The similarities between the vibrations of musical instruments and those of the atoms in a molecule inspired the MIT team to investigate how to represent protein sequences using music. Each atom in a molecule vibrates periodically. These vibrations change as the molecule absorbs and emits energy, moving between quantized energy levels. Together, these energy changes create bands of vibrational intensity across a characteristic range of frequencies, called the molecule's vibrational spectrum. Buehler’s team analyzed the vibrational spectra of many of the molecules that make up protein structures and converted them into audible sound.[2] The team also used the structural features of the proteins to alter the tunes and further differentiate between sequences. For example, a more closely packed protein is represented by a more rapid succession of notes, while a more spread-out, less dense structure is played more slowly and smoothly. The scientists even accounted for overlapping sections of more complex proteins by representing them with counterpoint, in which one melody is played against another.[1]
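The two ideas in this paragraph, shifting molecular vibrations into the audible range and playing denser structures faster, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's published method: the function names, the 220 Hz target, the 0-to-1 "packing density" score, and the note lengths are all hypothetical choices made for the example.

```python
import math

def transpose_to_audible(freqs_hz, target_low_hz=220.0):
    """Scale every frequency by one common factor so the lowest lands on
    target_low_hz. Using a single factor keeps the ratios between
    frequencies (the musical intervals) unchanged."""
    if not freqs_hz:
        return []
    factor = target_low_hz / min(freqs_hz)
    return [f * factor for f in freqs_hz]

def note_duration(packing_density, max_seconds=0.5, min_seconds=0.125):
    """Hypothetical rule: a denser structure gets shorter (faster) notes.
    packing_density is an assumed score from 0 (loose) to 1 (tight)."""
    if not 0.0 <= packing_density <= 1.0:
        raise ValueError("packing_density must be between 0 and 1")
    # Linear interpolation from max_seconds (loose) down to min_seconds (tight).
    return max_seconds - (max_seconds - min_seconds) * packing_density

# Illustrative values only, not measured data.
vibrations_hz = [1.0e12, 2.0e12, 3.0e12]
pitches_hz = transpose_to_audible(vibrations_hz)
print([round(p, 1) for p in pitches_hz])
print(note_duration(0.9), note_duration(0.2))
```

Because every frequency is multiplied by the same factor, a vibration twice as fast as another still sounds exactly one octave higher after the shift, which is why a simple rescaling is enough to keep the "shape" of a spectrum recognizable by ear.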

Figure 2:
The images above depict two common secondary structures of proteins. In the sonification method developed by the MIT team, beta sheets have a smoother, slower melody because of their less dense structure. The alpha helix, by contrast, is represented by a more rapid melody because of its tighter, more constricted structure.
(Source: Wikimedia Commons)

The team then used the musical representations of over 100,000 processed proteins to train a neural network, a type of artificial intelligence algorithm. A neural network consists of connected nodes, and each connection carries a “weight.” These weights are simply numbers that are adjusted during training so that the network's outputs best represent the training data, which in this case are the musical representations of the proteins.[3] When the trained network generates a sequence, the weights help determine which data point comes next based on the points already in the sequence. The network was thus able to produce new musical sequences within a set variation of the training data, which could then be translated into entirely new proteins.[1]
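To make the idea of "adjusting weights so outputs match the training data" concrete, here is a toy next-note model in Python. It is a deliberately tiny stand-in, a single layer of weights with a softmax, and not the architecture the MIT team used, which this article does not describe. Encoding notes as small integers, the function names, and the training settings are all assumptions made for illustration.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_next_note_model(sequences, vocab_size, epochs=200, lr=0.5):
    """Learn weights[prev][next] so that softmax(weights[prev]) matches
    how often each note follows `prev` in the training sequences."""
    weights = [[0.0] * vocab_size for _ in range(vocab_size)]
    pairs = [(s[i], s[i + 1]) for s in sequences for i in range(len(s) - 1)]
    for _ in range(epochs):
        for prev, nxt in pairs:
            probs = softmax(weights[prev])
            for k in range(vocab_size):
                # Cross-entropy gradient for each score: p_k - 1 if k is the true next note.
                grad = probs[k] - (1.0 if k == nxt else 0.0)
                weights[prev][k] -= lr * grad
    return weights

def generate(weights, start, length, rng):
    """Sample a new sequence one note at a time from the learned probabilities."""
    seq = [start]
    for _ in range(length - 1):
        probs = softmax(weights[seq[-1]])
        seq.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return seq

# Hypothetical training "melody" using three notes labeled 0, 1, and 2.
training = [[0, 1, 2, 0, 1, 2, 0, 1, 2]]
model = train_next_note_model(training, vocab_size=3)
print(generate(model, start=0, length=6, rng=random.Random(0)))
```

Each note here depends only on the one before it, which is much simpler than a network that considers a longer history. The sketch is meant only to show weights being nudged, step by step, until the generated sequences resemble the training data.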

For these new proteins to be useful in drug development, enzyme optimization, and other applications, they must be viable. To test this, the researchers used the musical sequences generated by the artificial intelligence algorithm to build models of the new proteins atom by atom.[1] Once the models were built, the team analyzed bond stability, potential environmental conditions, and other characteristics to determine the proteins’ stability.[4]

Dr. Buehler acknowledges that scientists have used sonification, the process of converting information into sound, in prior research, and he plans to extend his team’s current method by incorporating bends and more complex protein folding.[1] Because the potential benefits of this method of protein design hinge on the viability of the designed proteins in their target environments, the team will continue to examine the structures of the designed proteins through comparison with existing viable proteins or through laboratory tests. Ultimately, Dr. Buehler believes that the interdisciplinary field of science and music will have a multitude of applications once it becomes more widely accepted by the scientific community.[1]


[1] Cowen, R. (2020, March 18). Amino Acid Rock Music Helps Build New Proteins. Retrieved March 29, 2020, from

[2] Qin, Z., & Buehler, M. J. (2019). Analysis of the vibrational and sound spectrum of over 100,000 protein structures and application in sonification. Extreme Mechanics Letters, 29, 100460. doi: 10.1016/j.eml.2019.100460

[3] Hardesty, L., & MIT News Office. (2017, April 14). Explained: Neural networks. Retrieved March 29, 2020, from

[4] Deller, M. C., Kong, L., & Rupp, B. (2016). Protein stability: a crystallographer’s perspective. Structural Biology Communications, 72, 72–95. doi: 10.1107/S2053230X15024619
