Wednesday, March 22, 2017

MvM workflow -- Combine DDA with DIA!



This one takes a second to wrap your brain around -- but...to get to proteins that are only estimated to be expressed at 50 copies/cell(!!) it is worth it.

The paper is brand new and can be found at JPR here.


The basic idea is that if you run your normal DDA (TopN-type) experiment, you can break the peptides coming off the column into 3 groups:

Group 1 -- Fragmented and identified in all runs and any label free algorithm will give you amazing quantification

Group 2 -- Fragmented in a few, but not all runs. Identified, but you'd have to infer (or impute)  their identity from MS1 only in the other runs

Group 3 -- Peptides you never fragment that are just too low in abundance to ever crack the N-most intense in your TopN experiment

The MvM strategy (Missing Value Monitoring) specifically focuses on Group 2. You have this subgroup of peptides that have been identified -- which means you have a Peptide Spectrum Match (PSM) that you can use to create a spectral library.

If you then run DIA on every file you can use the spectral libraries you made to quantify the peptides with missing values across all of your runs.
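If the three groups are hard to picture, here is a minimal Python sketch of the bookkeeping. It is not the authors' code. The function name, the run names, and the peptide sequences are all made up for illustration, and it assumes you already have a set of identified peptides per DDA run.

```python
from typing import Dict, Set


def classify_peptides(ids_per_run: Dict[str, Set[str]],
                      all_peptides: Set[str]) -> Dict[str, int]:
    """Assign each peptide to Group 1, 2, or 3 based on how many runs identified it."""
    n_runs = len(ids_per_run)
    groups = {}
    for pep in all_peptides:
        # Count the runs in which this peptide was fragmented and identified
        n_identified = sum(pep in ids for ids in ids_per_run.values())
        if n_identified == n_runs:
            groups[pep] = 1  # identified everywhere
        elif n_identified > 0:
            groups[pep] = 2  # identified in some runs, missing values in others
        else:
            groups[pep] = 3  # never fragmented
    return groups


# Hypothetical toy data: three DDA runs and three peptides
runs = {
    "run_A": {"PEPTIDEK", "SAMPLER"},
    "run_B": {"PEPTIDEK"},
    "run_C": {"PEPTIDEK", "SAMPLER"},
}
detectable = {"PEPTIDEK", "SAMPLER", "LOWABUNDK"}

groups = classify_peptides(runs, detectable)

# Group 2 peptides are the ones you would put in the spectral library for DIA
mvm_targets = sorted(p for p, g in groups.items() if g == 2)
print(mvm_targets)
```

In this toy case only the peptide missing from run_B lands in Group 2, so it is the one you would chase with DIA.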

To test this strategy, this group uses a QE (the paper says QE HF, but the method section uses resolutions that show it as a QE Classic or Plus) on yeast cells during different stages in their developmental cycle or something. They are able to get incredible depth, with even the lowest-abundance proteins being quantified in all samples.

Up-side -- This approach doesn't use any funky software and you get much better label free quan!
Down-side -- You need to run every sample for both DDA and DIA.

I really like this paper because it is a clever approach I haven't considered before. If the queue in your freezer seems to be growing at an ever faster rate, this might not be the LFQ method of your dreams ;)

But...if you have the available instrument time that you could run each sample twice, this might be a great method to consider!

Tuesday, March 21, 2017

Prosight Lite! Free top-down analysis of single proteins.




Now that 20 papers are out that have cited the use of Prosight Lite it may be time that I actually link the paper on the blog -- as a partial thank you for how often I use this awesome free resource!

I'm too lazy to search the blog for some of the older posts on the software and I'm too busy with work to write a real post for Tuesday, so here is the paper for Prosight Lite!


Monday, March 20, 2017

Great review on structural proteomics techniques!


I've never done HDX-MS before. I think the idea is fascinating, but despite reading lots about it over the years -- well, I forget the key points.

This little review is awesome for linking stuff I do know well and stuff I don't and making a cohesive unit out of the big picture -- why we'd do this in the first place!

Even better? It is a great introduction for people who might be new to all of this -- good enough I'm gonna add it to the "Resources for Newbies" page over there -->

Shoutout to @KyleMinogue who is NOT this person! I checked.



Saturday, March 18, 2017

Great slide show on data storage and standardization!


Two great things come from following this link!

The first is (finally) something useful that came from LinkedIn!

The second -- is the great slide deck that walks you through challenges and perspectives in relation to proteomics data storage and meta-analysis!

Friday, March 17, 2017

Should you be using lower collision energy for PRM experiments?


Okay...so I was running my mouth again about how PRMs on a Q Exactive could beat SRMs for a QQQ and had to blow a weekend in lab running stuff to prove it to a bunch of skeptical people.

Caveats here for why I made this very costly dare (I probably only have a few thousand weekends left in my whole life after all...)

This researcher has only one peptide that he can use to confirm a positive outcome for this assay. One peptide. (Plus controls and whatever, of course).

There will be pressure for the LC-MS assay to be as short as possible.

The matrix is...whole...digested...human...plasma (or serum or whatever. A friend told me there was a difference yesterday and I still don't know what it is)

If you've got a protein you can get 3 peptides from for this, okay -- a QQQ might be the better choice for this assay -- but if you've just got one? I'm going PRM all day and never consider the QQQ.

I can't show the actual data cause I signed a thing that looked seriously scary. But I can tell you this -- there were so many peptides in the close mass range of this peptide in the digest on a 20 minute gradient that there was no way I could even trust SIM -- even at 70,000 resolution (max I had on the instrument I used) -- nope.  HAD to be PRM.

And -- when I was looking for fragment ions for my quantification (btw, I just extracted with Xcalibur and I believe it sums the fragment intensities rather than averages them -- but I'm not 100% sure -- the peptides look great in Skyline as well) there was enough coisolation interference with a 2.2 Da window that I couldn't use anything in the low mass range at all.

With this information I created the super-scientific scale that you see at the top of this post.  I really had to go to high mass fragment ions for specificity in my quan (and the best possible signal to noise!)  How complex is the matrix -- that with a 2.2Da isolation window there are smaller peptides you can't trust -- extracted at 5ppm...?
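For anyone who wants to see what a 5 ppm extraction window actually works out to, here is a small Python sketch of the arithmetic. The function names and the m/z values are hypothetical (I can't show the real ones), and this is just the standard ppm definition, not anything pulled out of Xcalibur or Skyline.

```python
def ppm_window(mz: float, tol_ppm: float = 5.0):
    """Return the (low, high) m/z bounds of a +/- tol_ppm extraction window."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed: float, theoretical: float, tol_ppm: float = 5.0) -> bool:
    """True if the observed m/z is within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed - theoretical) / theoretical * 1e6
    return error_ppm <= tol_ppm


# Hypothetical fragment ion at m/z 1000.0
low, high = ppm_window(1000.0)          # a window of +/- 0.005 m/z
ok = within_ppm(1000.004, 1000.0)       # about 4 ppm off
too_far = within_ppm(1000.006, 1000.0)  # about 6 ppm off
```

The window scales with m/z, so in absolute terms it gets narrower as you go down in mass. Even so, the low mass fragments were the ones I couldn't trust in this matrix.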

And, you know what? I could boost the intensity of these big fragment ions by dialing the collision energy back some.  Not a huge boost, but dropping the nCE down to 25 might have picked me up 10-20% in this particular assay for this particular peptide. (Your results may differ)



Let's check some experts!

I went to ProteomeXchange and searched "PRM" and downloaded some RAW data at random from a couple studies out there....and...I totally "discovered" settings I should have been using the whole time....yeah...you should probably use a little less collision energy for your PRMs!

The first 2 studies I pulled...used...25! (PXD003781 and PXD001731). 2 other studies -- RAW files completed just as I was wrapping this up -- appear to have used 27.  We're at 50/50, but my peptide really liked lower energy.

Side note -- these samples were given to a lab that ran them on a QQQ that would cost this researcher MORE than the Q Exactive I used, LOL!

BTW, the  QQQ lost again. In ultra complex matrices where QQQ is going to lose the S/N game -- and you don't really need the 500 scans/second -- what you need is certainty that what you are quantifying is the correct compound -- my money is on PRM. And -- holy cow -- if you can save money getting a Q Exactive over a QQQ for the assay....

Thursday, March 16, 2017

High precision prediction of retention time for improving DIA!


We've had peptide retention time in silico predictors for at least 15 years - and sometimes they work great. I don't think it is controversial at all to say that real peptide standards work better.

This recent Open Access Paper takes a look at the difference between the two -- as well as different retention time calibration mathematical models in the context of SWATH and DIA.


And the results are pretty clear from their work -- in DIA it helps a lot to have retention time control for your identifications. With the added uncertainty of the bigger windows or having the MS1 for quan that is not directly linked by the instrument to the MS/MS fragments -- this is really valuable.
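To make "retention time control" a little more concrete, here is a minimal Python sketch of the simplest possible calibration: a straight-line fit from iRT values of standards to observed retention times, then a predicted RT window for a target peptide. This is an assumption-heavy illustration. The paper compares several calibration models and I'm not claiming this linear one is theirs. The iRT values, retention times, and window width below are invented.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rt(irt, slope, intercept):
    """Convert an iRT value into a predicted retention time for this run."""
    return slope * irt + intercept


def rt_window(predicted_rt, half_width_min):
    """Extraction window (start, end) in minutes around the predicted RT."""
    return predicted_rt - half_width_min, predicted_rt + half_width_min


# Hypothetical standards: library iRT values and observed RTs (minutes)
standard_irt = [0.0, 25.0, 50.0, 75.0, 100.0]
observed_rt = [12.0, 32.0, 52.0, 72.0, 92.0]

slope, intercept = fit_line(standard_irt, observed_rt)
target_rt = predict_rt(60.0, slope, intercept)  # target peptide with iRT 60
window = rt_window(target_rt, 2.0)              # hypothetical +/- 2 min window
```

The tighter you can make that window, the fewer wrong fragments you have to sort through in each wide DIA isolation window.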

Also, this paper is great because it highlights how ridiculously great the Q Exactive Classic is for DIA. They can get over 10% more protein IDs with their high precision iRT model, pushing standard 2 hour DIA on human digests from 4,500 protein groups up to 5,000 protein groups!

5,000 protein groups in 2 hours from human digest!!!!!  I need to do more DIA....


Wednesday, March 15, 2017

Cell wide analysis of protein thermostability!


Okay --- I've GOT to get out the door before 5 if I've got any shot of making it to my talk at the NIH this morning...

BUT...I've got to say something about this AWESOME NEW PAPER IN SCIENCE!


Man, THIS is a Science paper. One of those things where you're scratching your head wondering -- "um...okay...why would we even want to know that...?...but that was a really smart way of doing it and I bet something will come out of it!"

It's 4:47!  I've gotta steal @ScientistSaba's notes (thanks!) on the paper and go!

It uses "LFQ to explore thermostability on a proteome-wide scale in bacteria, yeast, and human cells by using a combination of limited proteolysis and MS...The group maps thermodynamic stabilities of more than 8000 proteins across these 4 organisms. Their results suggest that temperature-induced cell death is caused by the loss of a subset of proteins with key functions." Sweet, right!?!

Worth noting, they do all the analysis with LFQ on a QE Plus using Progenesis IQ.

Tuesday, March 14, 2017

Param-Medic -- Automatic parameter optimization without database searching!

I'm honestly having trouble wrapping my brain around how this new free piece of software works -- and whether it would be an advantage over the tools I currently use for this. Regardless, it is an interesting read!


Somehow -- it can look at your RAW data and determine the mass accuracy settings that you ought to use for your database search, without looking at your database at all, unlike Preview or the IMP Recalibration node.

If you are using the Crux pipeline tools -- it has already been integrated as an option for you to check out. For the rest of us who don't want to use awesome free data processing pipelines from some guys in Seattle (what do they know about mass spec data processing anyway...), we can download the stand-alone and run it in Python.

Monday, March 13, 2017

Awesome clinical proteomics study on weight loss!


I'm gonna be conservative and say there are about 12 reasons to read this awesome new open access paper!



I'll name a few and see how far I get

1) A "how to" for clinical proteomics. 1 hour digestion? 45 minute runs? Now -- this is something practical for a clinical setting.

1.5) This had to move up the list. The samples were prepped with a robot liquid handler thing!

2) This section title "Plasma Protein Levels Are Individual-Specific" Holy cow! Why don't I have my own plasma proteome done yet?

3) XIC based label free quan (MaxQuant LFQ) applied to a clinical sized cohort (300+ patients; over 1200 runs!)

4) Beautiful downstream analysis -- that leads to clear biological conclusions on this cohort, including inflammation response, insulin resistance, etc.

I really think I could get to 12, but I do have a job and I should probably not be late for it!

Saturday, March 11, 2017

Ready for a new PTM to worry about? Cysteine glycosylation is all over the place in human cells!


Fun fact: Did you know that O-GlcNAc modified proteins were discovered in Baltimore over 30 years ago? See, there's more to my home town than fantastic infrastructure and friendly people!

Glycoproteomics is kind of exploding right now -- the enrichments are better, the separations are better, and the mass specs are ridiculously better, and the software has almost caught up....and I wonder if this great new paper at ACS is just the tip of the iceberg....



A whole new class of glycopeptides right under our noses! The evidence looks pretty clear cut to me -- and first analysis from this group suggests that it isn't even rare. Once they had a mechanism to enrich and a pipeline to search for them in the data they report proteins with this modification in virtually every subcellular fraction!


Friday, March 10, 2017

Changes in coffee proteomics during processing.


Want to learn a lot about coffee this morning and see some classic proteomics techniques put to good use?

Check out this new paper in Food Chem (Elsevier)

The idea? They dry coffee in different ways -- and some people have linked how they dry the coffee during processing to the quality of the coffee. Apparently, making coffee is really complicated.

So this group extracted proteins from coffee beans (btw, you need liquid N2 to extract peptides from coffee beans), did some 2D-gels and spot picked for an old MALDI-TOF to get to work on.

They find a couple dozen spots --  and can get a peptide or two from each spot for identification. Unsurprisingly they find some heat shock proteins are differentially regulated as well as a few other interesting proteins that make sense. Their next plan is to see if they can create model systems to tell if one (or more of these) are responsible for the taste difference.

I want to imagine this is how the taste test goes ---  coffee supplemented with Hsp70:


Coffee supplemented with: "homologous protein to putative rMLC Termites like superfamily protein" (another big spot on the gels)


...and now we know which one it is!


Thursday, March 9, 2017

MetaMorpheus -- Amazing software and strategy for finding all the PTMs!


I'm going to end my blog hiatus with the best paper I read while I was out recovering -- and it's this new one out of Wisconsin!


Let's start with a minor criticism -- if you saw the title of this article in your Twitter feed you might think that this is a review on the topic of PTMs and just go right past it. And you shouldn't pass by this one.

Here is the thing -- our database tools are really good at finding peptides from unmodified proteins in our databases. If your job as a proteomics scientist is to identify peptides from model organisms with perfectly complete annotated UniProt proteins that are not regulated in any way by PTMs you are in the clear -- we've got all the tools for you.  If, however, you are studying something that actually exists in nature (i.e., modifies virtually all of its proteins with chemical modification combinations of some kind) it's still tough in this field.  Our tools are designed for unmodified proteins. Looking for any modification is possible -- but computationally super expensive (example).

I LOVE this paper, btw. I had worried that my enthusiasm for it had something to do with all the painkillers from my knee procedures, but -- narcotic free -- still love the paper!  Here is the idea.

1) Screen the data at a high level with a great big mass tolerance window and look for PTMs
2) If finding evidence of the PTMs -- take a FASTA and build a more intelligent FASTA (at this point it must be XML) that includes this stuff (think of it like a ProSight Flat file where instead of using biologically curated data to build your database you are building your database with PTMs on the fly with the data that you have in hand)
3) With your smart database, re-search your RAW data with your normal tight tolerances so you get everything right.
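The step that turns step 1 into step 2 is basically "which known PTM does this mass shift look like?" Here is a rough Python sketch of that lookup. To be very clear, this is NOT how MetaMorpheus does it internally. The function name and the tolerance are assumptions for illustration, and the table is just a handful of well-known monoisotopic mass shifts rather than a real modification database.

```python
from typing import Optional

# A few common monoisotopic mass shifts in Da (a tiny illustrative table)
KNOWN_SHIFTS = {
    "Methyl": 14.015650,
    "Oxidation": 15.994915,
    "Acetyl": 42.010565,
    "Carbamidomethyl": 57.021464,
    "Phospho": 79.966331,
}


def annotate_shift(delta_mass: float, tol_da: float = 0.01) -> Optional[str]:
    """Return the closest known PTM within tol_da of the observed shift, else None."""
    candidates = [
        (abs(delta_mass - mass), name)
        for name, mass in KNOWN_SHIFTS.items()
        if abs(delta_mass - mass) <= tol_da
    ]
    if not candidates:
        return None
    # Pick the candidate with the smallest absolute mass error
    return min(candidates)[1]


# Hypothetical delta masses from a wide-tolerance search
observed = [79.9668, 15.9952, 100.0]
annotations = [annotate_shift(d) for d in observed]
print(annotations)
```

Anything that gets a name here is a candidate to write into the smarter database for the tight-tolerance pass. Anything that comes back as None needs a closer look before you trust it.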

If you're thinking -- "hey, I can do that, I have all the tools necessary on my desktop right now." You might be right. You can do a wide tolerance mass search, find all your deltaM masses, convert them to PTMs, make a better database (okay...maybe you can do that...I can't...unless now I'm firing up ProSight....building a Flat file and doing the rest of it that way....) and then re-search with that database.

My response -- can you download a free piece of software right now -- that'll just do the whole darned thing for you? It's called MetaMorpheus and you can get it right now -- right here!


(No relation.)

Okay -- so this doesn't come without a hitch -- you are STILL doing a huge delta M search to start your program -- and even as fast as Morpheus is the search space is tragically large. For one of their human cancer digests it takes 13 days to run the project on what sounds like a seriously beefy PC...but...to really truly get to the bottom of these PTMs with ultra high confidence of their presence and their site specificity -- in one workflow...??!?  I can't wait to give this a try!!

Wednesday, March 8, 2017

Why do I hate this wine? Or...how I learned how to do metabolomics...


This is off the proteomics topic completely! Here is the thing. I have a very skeptical friend with some ridiculously cool samples. Like -- if anyone else in the world is brave enough to get these things -- I don't know who they are. And -- we talked about doing metabolomics on these samples. But before I could have these samples I needed to be able to prove that I could...well...do metabolomics.... And -- I may talk a LOT -- but I'm not gonna pretend I know how to do something especially at the risk of wasting precious samples!  I'll just spend a year of my spare time learning how to do it!

To learn the field, I did what I normally do -- I started a metabolomics blog -- and forced myself over the last year to read as many papers as I could on the topic. It is still a new field to me, so I know I've got tons to learn, but I may still link it to the right somewhere. Maybe someone will learn something from it, and I don't mind feeling dumb.

Okay -- so you can read a lot about something and that's awesome, but you need to run the instrument, clog a few columns and lose your temper with the software a few times to learn a new discipline, right? And it helps if you have some motivation....


...okay...so here is a perfectly anonymized map of a region with 15 commercial vineyards within a 20 minute or so drive of one another. My favorite wines in the world come from this area -- amazing and in my happy range of $8-$15 a bottle even after they arrive here!

I went to such efforts to anonymize this -- cause there is an exception here. I don't like one of these vineyards. They are using the same grape varietal as everyone else. They are using the same strict rules of their appellation, in terms of how long they have to age, etc., but there is something that I really don't like about what they make.

Wine is just a mix of small molecules, right? As good of an excuse as any -- and with wine you're not exactly material limited!

Over an undisclosed time period I collected 1mL from a number of different bottles of wine from this region. The rest of each bottle was disposed of in a manner that meets the strict ethical guidelines of my undergraduate fraternity. Once a number of samples were collected, I borrowed a Q Exactive classic system with RSLC3000 from an old colleague using some vague statements and a promise to clean the S-lens later (which I totally did).

There aren't a bunch of "Q Exactive wine metabolomics app notes" but if you erase the word "wine" you're set -- I found 2 that were very similar, couldn't decide which one was better (now know this one is -- warning .PDF download) and ended up on the following methodology (used the columns and flow-rates they describe, btw - oh, and I injected 5uL of wine on column, cause why not...?)

You're basically doing C-18 separation in positive and negative just like for everything else except you're using a lower mass cutoff and +1/-1 charged ions are a good thing!  Pull that off and you are doing the instrument side of metabolomics!

Metabolomics is, however, ahead of us (in my humble opinion) in terms of the data processing in some ways. In most of the software I've tried so far they start with what is quantified -- and statistically significant between their sample sets -- THEN they care about finding out what it is. They have massive reductions in their search space by going to the XIC and throwing out all the stuff that is 1 to 1.  Who cares about the molecules that aren't changing? Not me!

To find what is significant, metabolomics software relies heavily on statistical tools.

Check this out --



This is a shot from Compound Discoverer (which, btw, is super easy to learn if you are using Proteome Discoverer 2.0 or newer).

(Look familiar?)

This is one of the first steps in analysis -- Volcano plots showing the fold change of your compounds on one scale and the P-value (!!!) on the other. You can just take your list of statistically(!!) significant changes that you find graphically and export them into a darned list!  Out of thousands of compounds detected in these weekend runs -- there are about 200 that are 10x up or down regulated with a p-value cutoff of 0.05. Wish you could do something that easy in Proteome Discoverer to get to the bottom of what is interesting...? I hope I'll have good news for you soon!
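If you want to see what that volcano plot filter boils down to, here is a minimal Python sketch. It is not what Compound Discoverer runs under the hood. The function name, the compound names, and the numbers are invented, and it assumes you already have a fold change and a p-value for each compound.

```python
import math


def significant(compounds, min_fold=10.0, max_p=0.05):
    """Keep compounds changing at least min_fold (up or down) with p <= max_p."""
    threshold = math.log2(min_fold)
    hits = []
    for name, fold_change, p_value in compounds:
        # abs(log2) treats 10x up (10.0) and 10x down (0.1) the same way
        if abs(math.log2(fold_change)) >= threshold and p_value <= max_p:
            hits.append(name)
    return hits


# Hypothetical (name, fold change vs. control, p-value) rows
table = [
    ("cmpd_1", 200.0, 0.001),  # way up and significant
    ("cmpd_2", 1.1, 0.6),      # basically 1 to 1, throw it out
    ("cmpd_3", 0.05, 0.01),    # 20x down and significant
    ("cmpd_4", 12.0, 0.2),     # big change but not significant
]

hits = significant(table)
print(hits)
```

Working on the log2 scale is what makes the plot symmetric, so the down-regulated stuff gets the same treatment as the up-regulated stuff.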

Interesting notes -- there are thousands of soluble small molecules that will stick to a C-18 column and ionize in a bottle of red wine! What?!? Initially, I'm thinking "that is way too high" but you've got small molecules from the grapes -- from the yeast -- from the wood of the barrels and stems -- so it doesn't seem that crazy...

Also -- and this is funny -- wines from the same vineyards cluster together just using PCA. Want to start a wine counterfeiter busting business on the side with your Q Exactive (if it is yours to do what you please, of course)? It is really easy to do. This is interesting to me cause anything you read on that stuff is done with big FTICRs -- and they, and their hungry helium habit, aren't necessary -- you can do this with a benchtop system easy.


That big circle? Wines from one vineyard in particular are quite inexpensive and multiple years were available. They clustered really well together. Proof of the terroir myth, LOL?

So -- the big question -- what is it about wines from that one place that are different than the others? To find this I've got to do one of them volcano plot thingies with the wines compared.

I strengthened my cutoffs to narrow the list way down -- yeah -- I'm not screening 200 compounds -- but I have a few huge outliers...and a few are quite informative....but this ends up being one of my favorites.


Wow -- that one is kind of an ugly looking peak -- and a lot of the samples are virtually zero so you can't see a good comparison -- but I'm still gonna leave it here. Check out the numbers, though! We're looking at something that is upregulated like 200 fold over my control bottles!

If you've got high resolution MS/MS fragmentation mzCloud does a good job of identifying things. It is pretty strict, though. Low mass fragment ions are wobblier than you'd think without a lower mass negative calibrant than SDS. I took a significant hit by not adding an additional lower mass calibration ion...but ChemSpider had no problem making the ID of this massively upregulated molecular species.


It is called 3-Methyl-4-octanolide -- but we generally call it "whiskey lactone" -- cause it is a big part of the taste of whiskey. Long story short -- it is significantly higher in some oaks than others. In American oak it is super strong compared to other oaks in the world.

Now -- this may have absolutely nothing to do with why I don't like wine from that one vineyard -- but, of all the other vineyards there -- as far as I have been told in follow up emails -- only one uses American oak in their barrels....guess which one?  It is likely just a funny coincidence, but it makes a good story.

I wasn't going out to really solve this -- I wanted to learn the techniques and learn the software and it's funny to me that I used a couple weekends and some cutting edge technology to tell what I think is an interesting story. I actually started writing this up to publish, but then I got lazy.  The important part is that --- I got some cells from culture prepped from my friends I mentioned earlier and the data for our ASMS poster (it's on the last day, if you want to see something I'm actually putting time into) convinced them that I could be trusted with the REALLY cool stuff.


Tuesday, March 7, 2017

Super phosphotyrosine enrichment!

As I continue to backlog some blog posts I was working on -- what about this awesome new paper and completely new strategy for pulling down phosphotyrosine peptides?!?!?  (Big shoutout to Saddiq for tipping me off to it!)


What is it? A completely different way to pull down peptides with phosphotyrosines on them. SH2 domains of proteins specifically bind to phosphotyrosines. This group figured out if they took a protein and modified the SH2 domains they'd end up with proteins that'll bind P-Tyr with super strength!

How's it work? Better than any enrichment I've ever seen!

Proof? Orbi Velos files are at ProteomeXchange (PXD003563) here!

Monday, March 6, 2017

Is there still stuff to discover in the red blood cell proteome?!?!


This new open access paper is low on color, but high on perspective. It seems we got to a certain point and then assumed we'd kinda conquered the red blood cell proteome.

In this short description of their analyses, they show evidence that we might want to look a little deeper. 


There might still be cool stuff in there!