Wednesday, December 7, 2016

Great...SUMO in malaria parasites....

Mostly leaving this article here so I don't forget about it!

SUMO or Sumoylation is a weird set of PTMs(?) people are finding everywhere. Unlike their human counterparts, SUMO proteins are very small. They covalently attach to proteins to modify their function. Some studies consider them analogous to ubiquitin regulation, though this might honestly be more of a convenient frame of reference than a true reflection of their mechanisms.

And...this new study...shows SUMO in Plasmodium falciparum.

Which, further PubMed searching shows me, has been known for a long time, just not to me. The importance of this study is it helps narrow down where we might find these modifications and when.

As a side note -- looking for "Sumo Pugs" found me the great image above, which is part of an original set of 3 paintings called "Sumo Pugs."

In case my exceptionally understanding, beautiful and patient wife reads this far when this post hits her Feedly -- They were surprisingly inexpensive (more supply than demand somehow?!?!) and I promise to put them in my office or something... ;)

Tuesday, December 6, 2016

CanProVar -- Another great resource for cancer mutation FASTAS!

How useful is a FASTA that doesn't have your peptide sequence in it? A lot less useful than one that does!  Sure, we can de novo or Byonic it, but try streaming NBA games when your resource monitor has looked something like this since Friday....

(It actually doesn't look that bad right now, but I did have to end some runs and move Byonic from "Heavy" CPU usage to "Normal", which is a nice feature, and fine for streaming as long as I don't go above 720p.)

What was I talking about? Oh yeah!

It is a whole lot easier when the FASTA contains your peptide sequence!  And if you're studying something like cancer, chances are the normal database resources don't have what you're looking for.
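If you want to check whether a given FASTA actually contains your peptide before committing to a long search, something like the following would do it. This is just a minimal sketch: the two entries and their headers are toy examples I made up for illustration (they differ by a single residue), and the match is a plain exact-substring test with no I/L ambiguity handling or digestion rules.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence} from FASTA-formatted lines."""
    entries: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            entries[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so keep appending to the current entry.
            entries[header] += line
    return entries


def entries_containing(peptide: str, entries: Dict[str, str]) -> List[str]:
    """List the headers whose sequence contains the peptide as an exact substring."""
    return [header for header, seq in entries.items() if peptide in seq]


# Hypothetical two-entry database: a reference entry and a single-residue variant.
demo_fasta = """\
>demo_protein_reference
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRK
>demo_protein_variant
MTEYKLVVVGADGVGKSALTIQLIQNHFVDEYDPTIEDSYRK
""".splitlines()

db = parse_fasta(demo_fasta)
print(entries_containing("LVVVGADGVGK", db))  # only the variant entry has this peptide
```

For a real file you would pass an open file handle to parse_fasta instead of the demo lines. The point is the same one as above: the variant peptide only turns up if the variant sequence is in the database.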

You'll find a couple of enthusiastic posts on this dumb blog about the XMAn cancer mutation database that Virginia Tech researchers assemble and maintain (Go Hokies!)

CanProVar is another sweet resource out of Vanderbilt that is built from public repositories and dbSNP files (whatever those are).

Which one should you use?

A first comparison between the files is that CanProVar is smaller than XMAn, which would vastly improve the search time on big datasets. The downside, of course, is that the database is...smaller.... which probably means fewer mutations are present.

You can check out CanProVar here.

Apparently, it predates other resources by quite a bit, cause here is a paper on it from 2010. Somehow, I've either never seen it before -- or I've forgotten it. And this is exactly why Twitter is awesome for science!

Monday, December 5, 2016

Urine metaproteomics for rapid clinical UTI diagnoses?

Traditionally, clinical microbiology diagnostics has been 1) labor intensive 2) Slow 3) Error prone 4) Slow and 5) Slow.

A lot of groups are working on it, and I think most people at ASMS 2016 saw Dr. Pandey's overview of the work Johns Hopkins is doing to overhaul the field with rapid LC-MS/MS microbial identifications that massively exceed the diagnostic powers of MALDI-TOF based ID.

Lots of techniques are going to be necessary to improve clinical micro -- because there are a LOT of clinically important pathogens -- and body fluids.

This brand new multi-lab study in my local area is a great, reasonably simple and obviously effective approach that could be applied in the clinic to rapid urine diagnoses.

The study is somewhat of a follow-up to a paper from some of these authors last year and since I'm reading them back to back, I'm probably going to kind of lump them together.

Despite the massive dynamic range problems (we seem to see in all body fluids) and the presence of the normal microbiota, these authors demonstrate a surprising level of diagnostic power from their workflow. 12 healthy humans vs 12 patients with UTIs...and they find both diagnostic markers to separate the 2 groups -- as well as the power to identify individual pathogenic species.

The workflow is as shown in the picture above, so I won't go into it. Something that stood out as interesting to me is the database they started out with.

They used a tool called CD-HIT to generate their FASTA database and this tool seems really interesting. You can direct link to it here and this is a somewhat recent paper detailing the tool.

I'll have to investigate this tool a bit later.

If you are interested in clinical proteomics for diagnostics, or just urine in general, I'd definitely look up these papers!

EDIT after a short dog walk: One of the cool findings in the newest study, which didn't really hit me until we were some ways from the office, is that the urine sediment pellet has a lot of diagnostic power -- and if you only look at the soluble proteins you might miss what you need for accurate diagnosis.

Sunday, December 4, 2016

Let's examine the changes aspirin causes in the lysine acetylome!?!?

The hardest part of my Sunday morning so far has been selecting from the incredibly strange images that Google Images suggests for "aspirin side effects."


Because I have an increasingly odd sense of humor?

No!  Well...partially, but also because of this new paper in press at MCP!

Even if I hadn't made it past the introduction of this paper, this would still have been worth my time, because I never had any idea how aspirin worked until this morning. In fact, I never knew that I wanted to know how aspirin worked. And if I had been asked, I would never have guessed that it would cause global level shifts in our acetylomes!

To study this, they made some heavy labeled aspirin (they tell you how to do it, even) and they treated some HeLa cells with it. Now...HeLa cells might seem like a funny pick for a pathway that we are going to extrapolate might work this way in normal biology, but with a methodology this good/interesting and well laid out, you could easily check to see if these observations reproduce in a more biologically relevant model!  (Worth noting, this group is aware of the hypertriploid nature of this cell line AND used this in the standardization of the protein abundance. Which is pretty awesome!)

After dosing the cells with the heavy aspirin (and DMSO controls, of course) they digest out the peptides with LysC and do some regular proteomics. They take some of the peptides and use anti-acetylated lysine agarose beads for the enrichment steps. All the mass spec work is done on a QE Classic.

(They also throw in some SILAC time course, btw! but you'll have to read the paper for more info!)

All the downstream data processing is done with MaxQuant and they come out with some optimistic findings. While aspirin, in vitro, is going to acetylate the crap out of everything and look like it's going to totally mess up all protein function -- in an in vivo model the effects don't seem nearly so destructive.

TL/DR: Great paper showing 1) how aspirin works and 2) How to do a really thorough lysine acetylomics study!

Saturday, December 3, 2016

Open Tubular LC -- a universal separation solution?

I've never ever heard about this before. Never. GoogleScholar has, however, and points me toward a review from 1984 for more information...

The difference being that they were doing higher flowrates in the 80s than they demonstrate in this cool new paper!

A reason you might want to check this out? How about they separate some metabolites using this technique -- wait, is that a lipid? I dunno -- and some peptides -- and some intact proteins -- and they DON'T change the setup!?!?  All this stuff just elutes correctly?

It is a weird universal separation technique with stupid levels of sensitivity. They detect 25 attograms of a compound with a Q Exactive Classic. Didn't know what was below "femto?" It appears to be "atto"!
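To put "atto" in perspective, here is a quick back-of-the-envelope conversion from attograms to molecules. The 500 g/mol molar mass is purely my assumption for illustration (the compound's actual mass isn't given here), so treat the result as an order-of-magnitude sketch.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ATTO = 1e-18              # the SI prefix below femto (1e-15)


def attograms_to_molecules(mass_ag: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass in attograms to a number of molecules."""
    grams = mass_ag * ATTO
    moles = grams / molar_mass_g_per_mol
    return moles * AVOGADRO


# Assumption: a hypothetical 500 g/mol small molecule.
molecules = attograms_to_molecules(25, 500.0)
print(f"{molecules:.0f} molecules")
```

With that assumed molar mass, 25 attograms works out to about 5e-20 mol (50 zeptomoles), or roughly 30,000 molecules.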

Friday, December 2, 2016

Differential glyco and membrane enrichment in a KRAS mutant model!

Full disclosure, I'm not totally unbiased on this one cause I contributed to it, but the cool part is the upstream discovery workflow (and I only helped with validation).

The paper is in press at Oncotarget right now, but the pre-release version is available here.  In a nutshell, this paper comes out of the NCI KRas initiative, which hopes to better understand how and why KRas mutant cancers are so hard to treat.

They start with a cancer cell line that has normal KRas and they mess it up by transfecting it with a mutant version of the protein. (They also transfected some of the cells with a vector that didn't have the mutated protein in it as a control, cause they're classical scientists like that).

The team then split the sample: 1 part got enriched for surface glycopeptides with a chemical that can't get into intact cells (reference here) and the other part was analyzed using a membrane enrichment strategy they developed themselves a while back that I've really only ever seen done at their facility, but absolutely works.

In the end, they come up with around 500 proteins and/or glycosylation sites that appear to change on the surface just cause the cell is now making some of that mutant KRas thing...suggesting why so many scientists have been studying this protein -- what it does is important.

For the downstream work they use my 2 favorite validation techniques ever! Some super gnarly immunohistochemistry with a ridiculously nice microscope (check this out!). Imaging technology has been improving...cause that is some sick resolution on the cell surface....

And I helped them check some stuff they were most curious about with PRMs using some heavy peptide standards they had synthesized. NanoLC-PRMs, FTW! Even on an Orbitrap Elite, which maybe doesn't have the best architecture for PRMs, you can get some ridiculous sensitivity and (of course!) totally linear quan!

Thursday, December 1, 2016

Need the perfect standard for intact MS and/or top-down? Check this out!

Shoutout to Bob for tipping me off to this one!

Is this the perfect mixture for optimizing your intact protein mass measurement and/or top-down system? Some of the smartest people I know who do this seem to think so, cause they either contributed to this new App note or are testing it!

Wednesday, November 30, 2016

Thermo MSF Parser!

Wow! Am I missing more stuff all the time? Starting to feel like an NBA referee. (The Will Smith video above is hilarious, btw).

Proof? Check out this cool paper from like 5 years ago!

It is about this cool tool that you can use to do downstream analysis of PD 1.3 and 1.4 results. They hint that they are working on new versions as well.

So...all that stuff I've written on here about using the free SQLite tools...well...yeah...that still works, too....
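For anyone who hasn't tried the SQLite route, the first step is usually just seeing what tables are in the file. Here is a minimal sketch using Python's built-in sqlite3 module. It assumes, as the SQLite tools above imply, that the result file opens as an ordinary SQLite database. The demo file and its table names (DemoPeptides, DemoSpectra) are hypothetical stand-ins I create on the fly, not the real PD schema.

```python
import os
import sqlite3
import tempfile
from contextlib import closing
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the table names in an SQLite file, sorted alphabetically."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Build a small stand-in file so the example runs without a real result file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.msf")
    with closing(sqlite3.connect(demo_path)) as conn:
        conn.execute("CREATE TABLE DemoPeptides (id INTEGER PRIMARY KEY, sequence TEXT)")
        conn.execute("CREATE TABLE DemoSpectra (id INTEGER PRIMARY KEY, scan INTEGER)")
        conn.commit()
    demo_tables = list_tables(demo_path)

print(demo_tables)
```

Point list_tables at your own file path and you get the real table names back, which is the starting point for writing any query against it.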

Tuesday, November 29, 2016

A new genomics based scored protein-protein resource!

Several groups are pounding away systematically working up protein-protein interactions -- at the protein level. This new paper in Nature Methods...

thinks that if we mine the existing data and use all the proteins that have phylogenetic relationships with one another we can get to the answer of who interacts with whom.

The results are impressive. And it is worth noting that even though they didn't even cite the BioPlex resource, 57% of the data points they incorporated came from direct experiments with human proteins.

BTW, InWeb_IM is their resource in the Venn Diagram above, so even if they aren't right about all of them, it is a whole lot more protein-protein interactions for us to look through!

You can access InWeb_IM's nice web interface directly here.

You might not want to directly Google it, cause the site it took me to isn't the one in the paper -- and looks a little scammy!

Monday, November 28, 2016

CSF-PR 2.0 -- Now with more cerebrospinal fluid PTMs!

The CSF-PR resource has been around for a couple of years, but needed a revamp cause, you know...something like 90% of the proteomic data EVER generated has shown up in the last 4 years or so. (Do I have that pie chart on here? I LOVE that pie chart!)

For details on the updates check out this new one from MCP here!

There are some other databases out there on CSF. This one stands out cause of the following criteria:

Has to be LC-MS/MS from living humans
Has to contain at least 20 patients per study
The data from the study has to be publicly accessible AND in some way be open to quantification between 2 disease groups or between 1 disease group and 1 control group with an n greater than 3 in both groups.

Unsurprisingly, considering how relatively few people happily line up for CSF withdrawals, they whittle a relatively large number of published studies down to a much smaller set of super high quality (and medically interesting!) studies.

You can directly access this resource here!

Sunday, November 27, 2016

FDR calculations applied to Orbitrap Metabolomics data!

Not more metabolomics...geez...

Yes, I know, this belongs somewhere else, but I promise it is really super cool. (Link to paper here!)

From our perspective, it probably seems pretty straight-forward, right? If you've got MS/MS data that you are saying is this small molecule, maybe you'd want to do some sort of a false discovery measurement, right?!?  And...maybe if you've jumped head-first into doing metabolomics cause it's super interesting, you might be put off a little at first cause you don't have FDR measurements.

Turns out it isn't quite so easy with the small molecules, thanks to how they don't fragment as nicely as peptides do, and we can't just move down the line to the next peptide sequence that is truly unique -- since there isn't a second peptide. You get one shot at identifying and quantifying.

This paper introduces two ideas -- JumpM and MISSILE -- that are a little incongruous, but together assemble a full methodology for how they think metabolomics should be done with heavy standards, Orbitrap data and target-decoy based FDR. It is honestly way smarter than the way I do it....
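For anyone who hasn't seen target-decoy FDR spelled out, here is the generic idea in a few lines. To be clear, this is the textbook decoys-over-targets estimate, not the paper's actual implementation, and the scores below are made-up demo values.

```python
from typing import List, Optional, Tuple

Match = Tuple[float, bool]  # (score, is_decoy)


def fdr_at_threshold(matches: List[Match], threshold: float) -> float:
    """Estimate FDR as decoys / targets among matches scoring >= threshold."""
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0 if decoys == 0 else 1.0
    return min(1.0, decoys / targets)


def lowest_threshold_for_fdr(matches: List[Match], max_fdr: float) -> Optional[float]:
    """Lowest observed score cutoff whose estimated FDR is <= max_fdr (None if none qualifies)."""
    best = None
    for cutoff in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, cutoff) <= max_fdr:
            best = cutoff
    return best


# Made-up identifications: higher score = better match; True marks a decoy hit.
demo_matches = [
    (0.95, False), (0.90, False), (0.85, False), (0.80, True),
    (0.75, False), (0.70, False), (0.60, True), (0.55, False), (0.50, True),
]

print(fdr_at_threshold(demo_matches, 0.70))          # 1 decoy / 5 targets
print(lowest_threshold_for_fdr(demo_matches, 0.20))  # cutoff that keeps FDR <= 20%
```

The hard part for small molecules is the bit this sketch skips entirely: coming up with believable decoys in the first place.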

Saturday, November 26, 2016

Glycan analysis of protists! And other cool new Springer Protocols

Honestly, just leaving this here to remind myself to grab this new book at the library! But, how awesome does this new protocols book look?!?

Okay...actually...another new Springer Protocols book just rolled out as well and it should be on my "to borrow" list so I can close this tab on my browser.  Check this out!

Want a taste of the sweet stuff that is in this one? Check this protocol out!

Three browser tabs closed in one post? You're...welcome...?

Friday, November 25, 2016

Crowd-fund some proteomics?

There are a lot of crowd-funded projects out there these days. Seems like you can find all sorts of stuff. Like...possibly radioactive men's wedding rings that glow really really strongly in the dark..?

(Shoutout to the CarbonFi KickStarter!)

In something far more serious, a new company in New York called HayStack is taking a swing at crowdsourcing for getting started in Clinical Cancer Proteomics. You can check it out directly here!

Now...if this was just some people blowing smoke about their capabilities -- and that of the technology, I probably would have just retweeted this and kept going -- but when some guy named Pappin is one of your company founders, this might just deserve a quick blog post.

And...if it leaves me the chance to foreshadow the story of how I died of finger cancer...well, you know....

Tuesday, November 22, 2016

Polymorphic Peptide Variants and Propagation in Spectral Networks

Subtitle: Why everyone needs to take a whack at proteomics data!

Need a paper to mull on while avoiding discussing politics with your family this holiday weekend? Think on this one!

What is it? Wait, you can't tell from the title? Come on!

In all seriousness, it is a really unique (to me, at least) way of thinking about what those unmatched spectra might be in that organism you don't have a good database for. And it might just be brilliant. I can't tell.

I gave it a good read and then thought about it in my car while I enjoyed the combination of normal D.C. and possibly early holiday traffic(?) and this is what I think is going on. (And I might totally have this wrong.)

Imagine we're starting off with this organism that no one has sequenced before and we need to do proteomics on it. The mass spec side is the same as always (as long as it wasn't hard to lyse or whatever, of course) but then we've got no database for it. We could de novo it or use BICEPS, but these are both going to be super computationally expensive, full of false discoveries or require that you spend 2 years studying Python to use it (this approach may fail in one of these regards as well, I'll have to check).

Spectral networks goes sideways here. What if you could lower your bioinformatic load (what?!?) by running more samples? They go the easy route here and take 3 bacteria and do dd-MS2 on them. Then they take the spectra that are the most similar (by MS/MS fragments) and network them together. In this way you can 1) Find the most important features and 2) Start to limit what you're going to have to search.

I know this is wacky. Who has spare mass spec time?!? To this, I answer -- who can find a good bioinformatician for that salary that you can't seem to find a good mass spectrometrist for? Nobody, that's who!

Seriously -- what choice do we have when you are told to get some proteomics data on this organism? Wait and hope the genomics people are considering it a priority, will sequence it this year, and will annotate it by 2020?

Example set: They start with 3 species (or strains) of Cyanothece that biofuels people are seriously interested in that someone has done proteomics on. Serious proteomics:

Start with:
 >1e6 spectra/organism
Cluster the completely homologous peptides (identical ones from each run AND organism)
 = heck, if you search just those conserved ones you're gonna have a massive reduction in search space (but you're going to miss what makes that organism different from the others)
Cluster the MS/MS spectra that differ by only one mass shift. For example, the y ions are awesome till you get to the high mass ones and then each one is off by 8 in species 2 and 14 in species 3. (Or whatever.) Then move on to the next pairing!
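The "same spectrum except for one mass shift" step above can be sketched in a few lines. This is my simplified stand-in, not the spectral alignment the paper's tools actually use: it brute-forces every pairwise m/z difference as a candidate shift and keeps the one that explains the most peaks. The fragment lists and the 0.02 tolerance are made-up demo values.

```python
from typing import List, Tuple


def count_matches(a: List[float], b: List[float], shift: float, tol: float) -> int:
    """Count peaks in `a` that have a partner in `b` at the same m/z or at m/z + shift."""
    matched = 0
    for mz in a:
        if any(abs(mz - other) <= tol or abs(mz + shift - other) <= tol for other in b):
            matched += 1
    return matched


def best_single_shift(a: List[float], b: List[float], tol: float = 0.02) -> Tuple[float, int]:
    """Try every pairwise m/z difference as a candidate shift; return (shift, matched peak count)."""
    candidates = {round(y - x, 2) for x in a for y in b}
    candidates.add(0.0)
    # Prefer the shift explaining the most peaks; break ties toward the smaller shift.
    best = max(candidates, key=lambda s: (count_matches(a, b, s, tol), -abs(s)))
    return best, count_matches(a, b, best, tol)


# Toy fragment lists: the low-mass peaks agree, the three high-mass peaks are all off by 8.
species_1 = [175.12, 288.20, 401.29, 530.33, 643.41, 756.50]
species_2 = [175.12, 288.20, 401.29, 538.33, 651.41, 764.50]

shift, n_matched = best_single_shift(species_1, species_2)
print(shift, n_matched)
```

On the toy data, a single shift of 8 accounts for all six peaks, which is exactly the "awesome till you get to the high mass ones" pattern described above.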

As a side effect, btw, you're going to get a quick understanding of evolutionary relatedness here -- without any genomic information on these guys! Most of these MS/MS spectra are the same and you didn't get the samples mixed up? These things are related for sure!

In this run through they break their spectra into something like 16,000 networks. So....this is just a little more complex than the example 2 paragraphs up, but it is for illustration purposes only.

But check this out -- you now have these networks, where this spectrumA is equal to spectrumB (+8Da at y7/y8/y9) and spectrumC is equal to spectrumB (- something). Now that it is all linked you dump in some matched spectral data. Some stuff that is ID'ed and perfect. The MS/MS spectra are linked to IDs and it falls together like dominoes.
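The "falls together like dominoes" part is basically a graph walk, so here is how I picture it. This is a sketch under my own assumptions, not the paper's pipeline: the spectrum names, the peptide "PEPTIDEK", and the -14 and +16 shifts are hypothetical, and a real implementation would also have to localize each shift to a residue.

```python
from collections import deque
from typing import Dict, List, Tuple


def propagate_ids(
    edges: List[Tuple[str, str, float]],
    seeds: Dict[str, str],
) -> Dict[str, Tuple[str, float]]:
    """Breadth-first propagation of seed IDs through a spectral network.

    edges: (spectrum_a, spectrum_b, shift) with shift = mass(b) - mass(a) in Da.
    seeds: {spectrum: peptide} for the spectra that already have confident IDs.
    Returns {spectrum: (peptide, cumulative mass shift relative to the seed)}.
    """
    neighbors: Dict[str, List[Tuple[str, float]]] = {}
    for a, b, shift in edges:
        neighbors.setdefault(a, []).append((b, shift))
        neighbors.setdefault(b, []).append((a, -shift))

    labels: Dict[str, Tuple[str, float]] = {s: (pep, 0.0) for s, pep in seeds.items()}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        peptide, offset = labels[current]
        for nxt, shift in neighbors.get(current, []):
            if nxt not in labels:  # first label to arrive wins
                labels[nxt] = (peptide, offset + shift)
                queue.append(nxt)
    return labels


# Hypothetical network: only spectrumB has a real ID.
demo_edges = [
    ("spectrumB", "spectrumA", 8.0),
    ("spectrumB", "spectrumC", -14.0),
    ("spectrumC", "spectrumD", 16.0),
]
demo_labels = propagate_ids(demo_edges, {"spectrumB": "PEPTIDEK"})
for name in sorted(demo_labels):
    print(name, demo_labels[name])
```

One confident ID on spectrumB ends up annotating all four spectra, each as "that peptide plus this net mass shift". Multiply that by thousands of networks and you can see why a handful of good IDs goes a long way.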

Does it work? They probably wouldn't have sent it to MCP if it didn't, but it definitely looks like it works. I find it makes more sense to me the more I think about it....

The pipeline is more complex than I described.

...but all the tools are freely available here. 

Monday, November 21, 2016

Proteomic analysis of patients with cerebral and uncomplicated malaria

Malaria sucks. I'm reading this book now:

...and it put how bad malaria sucks into perspective. One expert she references throws out a figure that more than half the human beings who have ever lived may have died...of malaria... If you are into a super depressing read on how a gross parasite has shaped our history, mostly by killing us by the millions and billions, I couldn't come up with a better suggestion.

For a more uplifting tale, I suggest this nice recent paper from Bertin et al.  In this work, these authors take a proteomic run at some patient sera with non-complicated (bad) and cerebral (really really really bad) Plasmodium falciparum (the species that generally kills you the first go 'round) malaria. They used label free quan and an Orbitrap Velos and some clever bioinformatic tricks (compound databases with lots of the Var sequences) and sweet downstream statistics to try and find some differences.

While there are tons of challenges with this monster of a disease, like crappy databases and poor annotation and mutations all over the place, they are still able to find some really interesting conclusions. Several of the differentially regulated proteins they find appear linked and may even work together in functional complexes.