Can Perplexity Reliably Answer Technical Questions in the Life Sciences?

Carlton Hoyt, 3:40 pm, October 15, 2024

The generative text AI tool Perplexity has rapidly gained popularity in the life sciences for its ability to show its references when answering. As those references can include scholarly articles, it has great potential as a literature review assistant for scientists. AIs are well known for both hallucinations and inaccuracy, however, and being wrong even a small percentage of the time is normally not something a professional scientist can afford.

We therefore decided to set up a little experiment: we asked three subject matter experts in the life sciences to ask Perplexity a series of three questions of increasing difficulty, then evaluate the answers:

A question which a grad student should be able to answer relatively easily
A question which most PIs in the field would be able to answer
A question requiring very specific knowledge to answer. Something which a scientist wouldn’t know unless it was within their specific research area.

Here’s what they asked, what Perplexity had to say, verbatim, and the scientists’ take on Perplexity’s responses.

Some quick notes on this experiment: The scientists were not asked to proofread or correct Perplexity’s responses, but rather just to provide overall feedback, which is found below the questions. They all used the free version.

Steve Perrault, PhD, former Associate Director of Research & Development at Foundation Medicine

Q1: What cell type is targeted by mRNA vaccines?

mRNA vaccines primarily target and deliver their mRNA cargo to antigen-presenting cells (APCs), particularly dendritic cells and macrophages.

Upon administration, the mRNA encapsulated in lipid nanoparticles is taken up by APCs. Once inside the APCs, the mRNA is released into the cytosol and translated into the encoded antigenic protein.

This antigenic protein is then processed and presented on the surface of the APCs in the context of major histocompatibility complex (MHC) molecules, allowing for the activation of antigen-specific T cells and initiation of an adaptive immune response.

The ability of mRNA vaccines to efficiently target and activate APCs in vivo has been demonstrated in studies using non-human primates like rhesus macaques.

These studies have shown that modified mRNA vaccines can effectively deliver their mRNA cargo to APCs, leading to robust antigen expression and subsequent activation of antigen-specific T cells.

Furthermore, the size of the lipid nanoparticles used for mRNA delivery plays a crucial role in their trafficking and uptake by APCs. Studies have shown that nanoparticles within a specific size range (around 20-100 nm) can efficiently drain into lymph nodes and penetrate the paracortex region, where APCs reside, facilitating their uptake and antigen presentation.

In summary, mRNA vaccines primarily target and deliver their mRNA cargo to antigen-presenting cells, particularly dendritic cells and macrophages, enabling the expression of the encoded antigenic protein and subsequent activation of antigen-specific T cells, which is a key mechanism underlying their immunogenicity.

Q2: How do mRNA vaccines escape endosomes?

mRNA vaccines are delivered into cells encapsulated in lipid nanoparticles (LNPs). For the mRNA to be effective, it must escape from the endosomes into the cytoplasm where it can be translated into proteins. The endosomal escape process is facilitated by the ionizable lipids in the LNP formulation:

Upon endocytosis, the LNP becomes trapped in an endosome, which is an acidic compartment within the cell

The acidic environment of the endosome causes the ionizable lipids in the LNP to become positively charged

This positive charge allows the LNP to interact with and disrupt the negatively charged endosomal membrane through electrostatic interactions

The shape change and disruption of the endosomal membrane enables the release of the mRNA cargo from the endosome into the cytoplasm

Optimizing the pKa values and lipid tail structures of the ionizable lipids can enhance their ability to become protonated and facilitate endosomal escape

Early endocytic compartments like recycling endosomes have been identified as key sites for productive mRNA escape, whereas prolonged retention in late endosomes leads to mRNA degradation

So in summary, the ionizable lipids undergo a charge-induced shape change in the acidic endosome that allows the LNP to disrupt the endosomal membrane and release its mRNA cargo into the cytosol for translation
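To put rough numbers on the pH-dependent charge switch described in this answer, here is a minimal sketch of our own (it is not part of Perplexity's output). It applies the Henderson-Hasselbalch relationship to a single ionizable amine. The pKa of 6.5 and the endosomal pH of 5.5 are illustrative assumptions, not measured values for any specific vaccine lipid, and the function name is hypothetical.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of an ionizable amine that carries a positive charge at a given pH.

    Henderson-Hasselbalch for a base:
        [BH+] / ([B] + [BH+]) = 1 / (1 + 10 ** (pH - pKa))
    Simplification: the lipid is treated as an ideal monoprotic base, ignoring
    any effect of the nanoparticle or membrane environment.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_PKA = 6.5          # illustrative assumption, not a measured value
PHYSIOLOGICAL_PH = 7.4
ASSUMED_ENDOSOME_PH = 5.5  # illustrative assumption

for label, ph in (("physiological", PHYSIOLOGICAL_PH), ("endosomal", ASSUMED_ENDOSOME_PH)):
    charged = fraction_protonated(ph, ASSUMED_PKA)
    print(f"{label} pH {ph}: {charged:.0%} of the lipid is protonated")
```

Under these assumptions only about 11% of the lipid is charged at pH 7.4, compared with about 91% at pH 5.5, which is the switch from mostly neutral to mostly cationic that the answer describes.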

Q3: What’s the structure of the ionizable lipid used for endosome escape with mRNA vaccines?

The ionizable lipids used in mRNA vaccines like the Pfizer and Moderna COVID-19 vaccines have a unique structure that facilitates endosomal escape and delivery of mRNA into cells.

Key Structural Features:

Ionizable Head Group: The lipids contain a protonatable or ionizable amino head group, typically a tertiary amine. This allows the lipid to be neutral at physiological pH but become positively charged in the acidic environment of endosomes.

Branched Lipid Tails: The lipids have branched or multi-tail lipid chains, rather than linear tails. This branched structure is believed to promote a cone-shaped molecular geometry that aids in endosomal membrane destabilization and escape.

Pfizer’s ALC-0315 has four short branched tails.

Moderna’s SM-102 has three branched tails, one of which is longer.

Ester Linkers: The lipid tails are often connected to the head group via ester linkers, which can influence the biodegradability and potency of the lipid.

The combination of the ionizable head group, branched tails, and ester linkers allows these lipids to effectively complex with mRNA, facilitate endosomal escape through the proposed “proton sponge” effect and membrane destabilization, and enable cytosolic delivery of the mRNA cargo.

Bonus historical science question: How did DNA nanotechnology start?

DNA nanotechnology originated from a landmark theoretical paper published in 1982 by Ned Seeman, which proposed the idea of using DNA molecules to create nanostructures.

This seminal work laid the foundation for the field and sparked the development of various techniques and applications in DNA nanotechnology.

The key idea proposed by Seeman was to exploit the predictable base-pairing interactions of DNA to construct artificial nanostructures with precise control over their shape and size.

He envisioned using synthetic DNA strands designed to self-assemble into desired geometries, taking advantage of DNA’s inherent programmability and molecular recognition capabilities.

The first experimental demonstration of this concept came in the early 1990s when Seeman and his colleagues successfully constructed DNA-based nanostructures, such as a cube and a truncated octahedron.

These initial achievements paved the way for the rapid growth of DNA nanotechnology, leading to the development of various techniques like DNA origami, DNA tiles, and DNA nanorobots.

Over the years, DNA nanotechnology has evolved into an interdisciplinary field, bringing together researchers from diverse areas like chemistry, physics, materials science, and computer science to explore the potential of DNA-based nanostructures for various applications, including nanoelectronics, biosensing, drug delivery, and molecular computing.

Steve’s take on Perplexity:

Steve said Perplexity was “pretty solid” but criticized it for a “lack of context, like if you’re in that field and wanting to understand a bigger picture based on a question, that isn’t there.”

But on a super reductionist view it seems pretty good. Like, what someone might give on an exam answer after studying, but not what someone who really knows a field should hopefully say to provide that bigger picture.

I don’t see anything “wrong” but someone who really knows their stuff would give a broader answer. This comes across as a pre-med major who’s really good at studying, and the tech details are accurate.

For the second two questions, if someone asked me this I would give the context: that this solved a problem that had been around since the 1960s, that it was solved by someone at UBC, how it was solved, and why it matters. Things that provide way more depth.

Regarding the bonus historical question on the origins of DNA nanotechnology which he decided to ask, he said: “This is accurate but misses the nuances that one can only know from really diving into it. I wouldn’t expect AI to come up with that. Probably not possible. But the technical details are right.”

Hussain Dahodwala, PhD, Associate Professor at the Institute for Bioscience and Biotechnology Research (IBBR) and former Director, Upstream Process Development at the National Institute for Innovation in Manufacturing Biopharmaceuticals (NIIMBL)

Q1: What are the challenges in enzyme engineering and what advances are anticipated in enzyme engineering for biobetters?

The key challenges and advances in enzyme engineering for biobetters are:

Challenges in Enzyme Engineering:
• Enzymes often have poor stability under industrial conditions, such as high temperatures, pressures, and organic solvents. This leads to poor product yields.
• Enzymes may have narrow substrate specificity, limiting their ability to work with non-natural substrates.
• Enzymes can be difficult to produce at scale in industrial fermentation processes.

Advances in Enzyme Engineering:
• Enzyme engineering techniques like directed evolution, rational design, and semi-rational design are being used to improve enzyme properties such as activity, specificity, stability, and stereoselectivity.
• Computational methods and machine learning are helping to expand the enzyme design space and accelerate the discovery of novel enzymes with desired properties.
• Strategies like enzyme immobilization, molecular modifications, and the use of nanoparticle or cell-based delivery systems are being explored to enhance the stability and performance of enzyme therapeutics.
• Advances in biotechnology and recombinant expression are enabling the industrial-scale production of engineered enzymes.

Overall, enzyme engineering is a rapidly advancing field that is critical for developing efficient, stable, and versatile biocatalysts to enable sustainable and cost-competitive bio-based processes.

Q2: What is the inspiration for Biomanufacturing 4.0 and how is it different from current biomanufacturing?

Biomanufacturing 4.0, also known as Pharma 4.0, is inspired by the broader Industry 4.0 concept and is focused on the digital transformation of biomanufacturing processes. The key differences between Biomanufacturing 4.0 and current biomanufacturing practices are:

Increased use of advanced digital technologies: Biomanufacturing 4.0 leverages technologies like artificial intelligence, machine learning, digital twins, and advanced process control to enable greater process visibility, optimization, and predictive capabilities.

Emphasis on data-driven decision making: Rather than relying solely on empirical knowledge, Biomanufacturing 4.0 aims to incorporate data analytics and modeling to make more informed, data-driven decisions throughout the product lifecycle.

Shift towards continuous improvement: Biomanufacturing 4.0 promotes a culture of continuous improvement, enabled by the increased process understanding and agile regulatory frameworks that allow for more efficient implementation of process changes.

Enhanced supply chain robustness: The advanced process control and predictive capabilities of Biomanufacturing 4.0 can lead to improved supply chain reliability and the ability to better manage variability, ultimately enhancing product quality and availability.

Accelerated product development: The digital tools and data-driven approach of Biomanufacturing 4.0 can help compress product development timelines, allowing for faster access to life-changing therapies.

In summary, Biomanufacturing 4.0 is inspired by the broader Industry 4.0 vision and aims to transform biomanufacturing through increased digitalization, data-driven decision making, and a focus on continuous improvement and supply chain robustness.

Q3a: What are non-canonical redox co-factors and how do we use them in cell culture?

Non-canonical redox cofactors (NRCs) are chemical compounds that can serve as alternatives to the natural cofactors like NAD(P)+ and NAD(P)H in biological redox reactions. Some key points about NRCs:

NRCs can overcome the limitations of natural cofactors, enabling specific delivery of reducing power outside of biological constraints and providing alternative reduction potentials to overcome thermodynamic barriers.

Examples of NRCs include 3-carbamoyl-1-phenethylpyridin-1-ium chloride (P2NA+) and 3-acetylpyridine adenine dinucleotide, which have been used in purified enzyme-based redox catalysis to increase economic viability and drive thermodynamically unfavorable reactions, respectively.

Researchers have developed methods to engineer microbes like E. coli to biosynthesize NRCs like nicotinamide mononucleotide (NMN+) to enable their full integration with renewable biomanufacturing processes. This involves developing growth-based screening platforms and exploring pathway combinations to achieve high intracellular NMN+ levels.

Using NRCs in cell culture can provide more economical and scalable biotransformation, as well as enable exquisite control and expand the range of chemistries accessible compared to natural cofactors. However, the impact of NRC accumulation on cell fitness must also be considered.

In summary, NRCs are emerging as important tools in cell-free biosynthesis and metabolic engineering to overcome the limitations of natural cofactors and enable new capabilities in biocatalysis and biomanufacturing

Perplexity did not attempt to answer the second part of the question, so Hussain made another attempt to get it to answer.

Q3b: How to use non canonical redox co-factors in cell culture?

Based on the search results, here is how non-canonical redox co-factors can be used in cell culture:

Metabolites and post-translational modifications (PTMs) can regulate cell signaling and gene expression through non-canonical mechanisms. Metabolites can serve as co-factors for enzymes and modify proteins through PTMs like palmitoylation, glutathionylation, and acetylation. These PTMs can alter protein interactions, stability, and function, thereby aligning cellular processes with changes in metabolic state. For example, glutathionylation of the transcription factor NF-κB decreases its ability to bind DNA, linking redox state to gene expression. Histone modifications like lactylation and crotonylation can also regulate gene transcription in response to metabolic cues. Additionally, palmitoylation can increase protein stability by blocking ubiquitination and proteasomal degradation.

Several studies have demonstrated the use of non-canonical redox co-factors to modulate cell signaling and death pathways in cell culture:

Dihydroartemisinin, a semi-synthetic derivative of artemisinin, induced ferroptosis in pancreatic cancer cells by altering redox homeostasis and lipid peroxidation.

The natural product 11-methoxytabersonine triggered necroptosis in glioblastoma cells through ROS generation, calcium signaling, and mitochondrial dysfunction.

Xanthohumol, a chalcone from hops, activated the Nrf2 antioxidant pathway while inhibiting NF-κB in pancreatic cancer cells, leading to reduced inflammation and oxidative stress.

Sulforaphane, an isothiocyanate from cruciferous vegetables, upregulated Nrf2 and antioxidant enzymes while downregulating NF-κB in breast cancer and macrophage cell lines.

In summary, non-canonical redox co-factors like metabolites and their PTMs can be leveraged in cell culture to modulate signaling pathways, gene expression, and cell death mechanisms in a targeted manner by exploiting their unique biochemical properties.

Hussain’s take:

These are questions that I regularly encounter in scientific conferences and use them to test graduate student knowledge. I can say that these questions can be answered by biotechnologists with a few years of experience.

The questions chosen had two parts each. One part would require “book learning” and the second part would require domain knowledge, some experience, and “intelligence” in order to answer.

In all three examples, Perplexity AI did an acceptable job of text mining the first half of the question. The definitions and concepts asked for were answered more accurately and with more clarity, including citations, than one would expect humans to remember.

The complexity of the jargon increased with each question, and in each case Perplexity AI was able to text mine the definitions. For reference, in my experience many grad students fumble at the third question.

However, in every question Perplexity AI ignored the second half of the question beyond the definitions. In other words, it does great at “What is” and fails at “how to”.

Separating the two parts was attempted with the third question, and the AI still failed to give an answer.

Chris Bjornsson, PhD, Embedded 3D & Automated Microscopy Specialist at ZEISS

Q1: How many tissue types are there in the human body?

There are four main types of tissues in the human body: epithelial, connective, muscle, and nervous tissues.

Epithelial Tissues
Epithelial tissues are sheets of cells that cover exterior surfaces of the body, line internal cavities and passageways, and form glands. Their main functions include protection, secretion, absorption, and excretion.

Connective Tissues
Connective tissues bind together and support other tissues. They include bone, cartilage, blood, adipose (fat), and fibrous connective tissues like tendons and ligaments. Their main functions are to bind, support, protect, insulate, store energy, and produce blood cells.

Muscle Tissues
Muscle tissues are composed of cells called muscle fibers that have the ability to contract or shorten, allowing for movement. There are three types: skeletal (voluntary), smooth (involuntary), and cardiac (heart muscle).

Nervous Tissues
Nervous tissues are composed of neurons and neuroglial cells that propagate electrochemical signals called nerve impulses, enabling communication within the body and the processing of responses.

The different tissue types work together to perform the various functions required for the human body to operate and maintain homeostasis.

Q2: How many orders of light does it take to form a microscope image?

To form a microscope image, it takes at least two orders of light according to the Abbe theory of image formation. The key points are:

According to Ernst Abbe’s theory, for a specimen detail to be resolved in the microscope image, the objective lens must capture at least two orders of diffracted light from the specimen, such as the 0th order (undiffracted light) and the 1st order of diffraction.

The more diffracted orders that are captured by the objective lens, the more accurately the image will represent the original specimen details.

Using immersion oil between the objective lens and specimen allows more diffracted orders to be captured compared to air, resulting in higher resolution.

Blue light is diffracted at a smaller angle than red or green light, so objectives can capture more diffracted orders of blue light, further improving resolution.

So in summary, capturing just the 0th order undiffracted light is insufficient to form a resolved image. At minimum, the 0th and 1st diffracted orders must be captured by the objective lens to resolve specimen details according to Abbe’s theory.
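The diffraction argument in this answer can be checked with a short calculation. The sketch below is ours, not Perplexity's, and assumes a periodic specimen (a grating) under axial illumination, where order m leaves the specimen at an angle satisfying n·sin(θ) = m·λ/d and is collected only if n·sin(θ) does not exceed the objective's numerical aperture (NA). The function names and the example values (500 nm period, NA 0.95 dry, NA 1.40 oil immersion) are illustrative assumptions.

```python
import math


def highest_captured_order(period_nm: float, wavelength_nm: float, na: float) -> int:
    """Highest diffraction order (on each side of the 0th) that enters the objective.

    Assumes a grating of the given period under axial illumination, so order m
    is collected when m * wavelength / period <= NA.
    """
    return math.floor(na * period_nm / wavelength_nm)


def is_resolved_axial(period_nm: float, wavelength_nm: float, na: float) -> bool:
    """True if the 0th order plus at least one 1st order are captured."""
    return highest_captured_order(period_nm, wavelength_nm, na) >= 1


PERIOD_NM = 500.0  # assumed specimen spacing

# Green light, dry objective vs. oil immersion objective.
print(is_resolved_axial(PERIOD_NM, 550.0, 0.95))  # False: only the 0th order enters
print(is_resolved_axial(PERIOD_NM, 550.0, 1.40))  # True: oil immersion admits the 1st order

# Same dry objective, blue vs. red light.
print(is_resolved_axial(PERIOD_NM, 450.0, 0.95))  # True: blue diffracts at a smaller angle
print(is_resolved_axial(PERIOD_NM, 650.0, 0.95))  # False
```

This only covers the axial case with the 0th and 1st orders. Oblique illumination, where other pairs of orders can enter the objective, is outside this sketch.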

Q3a: How does the wavelets model work to create a maximum intensity projection?

Wavelets are not directly used to create maximum intensity projections (MIPs) in medical imaging. MIPs are created by taking the maximum voxel value along a particular viewing direction and projecting it onto a 2D image plane.

The wavelet transform is a mathematical tool that decomposes a signal (like an image) into different frequency components at different scales or resolutions. It can be useful for tasks like image denoising, compression, and feature extraction, but it is not the primary technique used for MIP rendering.

MIPs are typically computed directly from the 3D volumetric image data by the following steps:

The viewing direction is chosen (e.g. axial, coronal, sagittal).

For each pixel in the 2D projection plane, the maximum voxel value is found along the corresponding ray through the 3D volume in the chosen viewing direction.

This maximum value is assigned to that pixel, creating the MIP image.

So in summary, while wavelets have many applications in image processing, the maximum intensity projection algorithm operates directly on the voxel data itself by finding the maximum value along rays through the volume. Wavelets are not an inherent part of the MIP rendering process.
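The three MIP steps listed in this answer are simple enough to sketch directly. The following is a minimal illustration of our own in plain Python (not Perplexity's output), assuming the volume is stored as nested lists indexed `volume[z][y][x]`. The function name and the tiny example volume are made up for illustration. With an array library such as NumPy the same operation is a one-line maximum along an axis.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Image = List[List[float]]


def max_intensity_projection(volume: Volume, axis: int = 0) -> Image:
    """Collapse a 3D volume to 2D by keeping the brightest voxel along each ray.

    axis=0 projects along z, axis=1 along y, axis=2 along x.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:
        return [[max(volume[z][y][x] for z in range(depth))
                 for x in range(width)] for y in range(height)]
    if axis == 1:
        return [[max(volume[z][y][x] for y in range(height))
                 for x in range(width)] for z in range(depth)]
    if axis == 2:
        return [[max(volume[z][y][x] for x in range(width))
                 for y in range(height)] for z in range(depth)]
    raise ValueError("axis must be 0, 1, or 2")


# Tiny 2 x 2 x 2 example volume (values are made up).
volume = [
    [[1, 5], [3, 2]],  # slice z = 0
    [[4, 0], [1, 9]],  # slice z = 1
]
mip_z = max_intensity_projection(volume, axis=0)
print(mip_z)  # [[4, 5], [3, 9]]
```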

Q3b: How does the wavelets model work to create an extended depth of focus?

The wavelet model works to create an extended depth of focus (EDF) image by combining in-focus regions from a stack of images captured at different focal planes using the wavelet transform. The key steps are:

A stack of images is acquired by capturing the same scene at different focal planes along the optical axis.

The wavelet transform is applied to each image in the stack, decomposing them into different frequency subbands at multiple scales.

For each spatial location and wavelet subband, the coefficient with the maximum absolute value across the image stack is selected. This selects the in-focus information from the stack.

The selected wavelet coefficients are used to reconstruct a composite wavelet representation, which corresponds to an EDF image containing the in-focus regions from across the stack.

Optional post-processing steps like enforcing local smoothness of the depth map and avoiding saturation/noise accumulation can further improve the results.

The key advantage of using wavelets is that they provide a multi-scale representation that can effectively isolate in-focus and out-of-focus regions based on their different frequency characteristics. The maximum selection rule across the stack allows combining the in-focus information into a single composite image with an extended depth of field. This wavelet-based approach, especially the complex wavelet variant, has demonstrated state-of-the-art performance for EDF microscopy imaging compared to other traditional algorithms.
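The selection rule in the steps above can be illustrated with a toy example. This is a minimal sketch of our own, not Perplexity's output and not the algorithm of any particular EDF package: it uses a single-level 2D Haar transform written in plain Python as a stand-in for the multi-scale (and often complex) wavelets used in practice, and it applies the maximum-absolute-value rule to every subband, including the low-pass one. All function names are hypothetical, and images are assumed to have even height and width.

```python
def haar2d(img):
    """One-level 2D Haar transform. Returns (approx, col_diff, row_diff, diag)."""
    approx, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)  # local average (low-pass)
            rows[1].append((a - b + d - e) / 2)  # difference across columns
            rows[2].append((a + b - d - e) / 2)  # difference across rows
            rows[3].append((a - b - d + e) / 2)  # diagonal difference
        approx.append(rows[0])
        col_diff.append(rows[1])
        row_diff.append(rows[2])
        diag.append(rows[3])
    return approx, col_diff, row_diff, diag


def inverse_haar2d(approx, col_diff, row_diff, diag):
    """Invert haar2d, rebuilding the full-size image from the four subbands."""
    h, w = len(approx), len(approx[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for c in range(w):
            s, x, y, z = approx[r][c], col_diff[r][c], row_diff[r][c], diag[r][c]
            img[2 * r][2 * c] = (s + x + y + z) / 2
            img[2 * r][2 * c + 1] = (s - x + y - z) / 2
            img[2 * r + 1][2 * c] = (s + x - y - z) / 2
            img[2 * r + 1][2 * c + 1] = (s - x - y + z) / 2
    return img


def fuse_focus_stack(stack):
    """Fuse a focal stack by keeping, per subband and location, the coefficient
    with the largest absolute value across the stack, then inverting."""
    transformed = [haar2d(img) for img in stack]
    fused = []
    for band in range(4):
        n_rows = len(transformed[0][band])
        n_cols = len(transformed[0][band][0])
        fused.append([[max((t[band][r][c] for t in transformed), key=abs)
                       for c in range(n_cols)] for r in range(n_rows)])
    return inverse_haar2d(*fused)


# Toy stack: fine detail (checkerboard) is "in focus" on the left half of the
# first image and on the right half of the second; the other halves are flat.
sharp_left = [[0, 2, 1, 1], [2, 0, 1, 1], [0, 2, 1, 1], [2, 0, 1, 1]]
sharp_right = [[1, 1, 0, 2], [1, 1, 2, 0], [1, 1, 0, 2], [1, 1, 2, 0]]
fused_image = fuse_focus_stack([sharp_left, sharp_right])
for row in fused_image:
    print(row)
```

In this toy case the fused image carries the checkerboard detail across its full width, because the detail coefficients from whichever image is sharp at each location win the selection.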

Chris’ take:

I’m impressed! I even tried to mess it up on the 3rd question but it caught me. Fast, too. I’ll actually start using it and spreading the word!

There was one error, I should mention, when it comes to how many orders of light are needed to form an image: it can be any two orders, and doesn’t necessarily need to include the 0th order. It can be 0+1 or 1+2 or 2+3, etc. The first part of that answer was correct, but the final summary statement was wrong.

Takeaways

Perplexity isn’t perfect, and is likely better at answering “what is” type questions than “how to” type questions. If you want to get a lot of context behind the answers, you’ll likely have to dig a bit. However, in these experiments it did seem to be relatively accurate, with few obvious errors. Used with caution, it can make a solid research companion and speed up literature review considerably.