Friday, January 16, 2015

Sorting out the identities of the supposedly 'cya' samples

There's some confusion (a euphemism for my lab-notebook errors?) about the identities of the strains that are annotated as cya-.  Scott reported at lab meeting that the sequence reads appeared to have intact cya genes.

Part of the confusion may arise because the cya (or crp) knockouts I might have used are not deletion mutants (they're not part of the set of deletion mutants constructed by Sunita).  Instead they're old mutants (more than 20 years old) that were created by random insertion of the transposon construct miniTn10Kan, which has a kanamycin cassette (from Tn903) inserted between the IS10 ends of transposon Tn10.  Alignment of sequence reads from such a mutant will show that all the bases of the gene are present and transcribed, but the aligned portions of reads that include part of the insertion will all appear to begin or end at the same position, the insertion point.

The other problem is genuine confusion about which strain(s) I actually used for these cultures.  So I’ve checked my lab notes, which are also not entirely consistent.

The master_sample_key table lists these samples as strain RR688.  Since RR688 is a clinical strain sent by Simon Kroll, this is probably a typo for strain RR668 (= cya::miniTn10kan). 

For Day F, my notes say that I used a culture of strain RR540 (= crp::miniTn10kan) which I had previously used in Day E.  I think this agrees with what Scott found in the RNA-seq results, that the supposedly cya knockout strains have intact cya genes but could have disrupted crp genes.

My notes for Day E are a bit confused.  They seem to say that the RR540 and RR668 strains didn’t grow and that I instead used a double knockout strain (RR3006) that was cya::Cm and crp::miniTn10kan.  I might have done this, but I suspect my notes are wrong.

Luckily, the sequence reads are the ultimate authority.  So I've emailed Scott asking him to confirm that the reads from both sets of ‘cya’ cultures (Days E and F) have disrupted crp genes and intact cya and sxy genes.  Because this is a miniTn10kan insertion, all the crp sequences should be present, but the read alignments that span this sequence (CAAAATACATCACG*TCAAGTCACGAAT) should all end at the *.  (I can’t give the genome sequence position number - this location information is taken from a pre-genome paper.)
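The check described above could be sketched roughly as follows. This is only an illustration, not the pipeline Scott uses: the function names, the 10-base anchor length, and the exact-match rule (no allowance for sequencing errors) are all assumptions. A read that carries both flanks back to back suggests an intact crp; a read that matches one flank and then diverges suggests the miniTn10kan junction.

```python
# Flanking sequence from the post; '*' marks the reported insertion point.
MARKED = "CAAAATACATCACG*TCAAGTCACGAAT"
LEFT, RIGHT = MARKED.split("*")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (other characters, e.g. N, are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]


def _classify_forward(read, anchor):
    # Hypothetical helper: exact string matching only.
    left, right = LEFT[-anchor:], RIGHT[:anchor]
    if left + right in read:
        return "intact"  # read crosses the site without interruption
    i = read.find(left)
    if i != -1:
        tail = read[i + len(left):][:anchor]
        if tail and not right.startswith(tail):
            return "junction"  # left flank followed by foreign sequence
    j = read.find(right)
    if j != -1:
        head = read[max(0, j - anchor):j]
        if head and not left.endswith(head):
            return "junction"  # foreign sequence followed by right flank
    return "uninformative"


def classify_read(read, anchor=10):
    """Label a read 'intact', 'junction', or 'uninformative', trying both strands."""
    read = read.upper()
    for seq in (read, revcomp(read)):
        label = _classify_forward(seq, anchor)
        if label != "uninformative":
            return label
    return "uninformative"
```

Counting the labels over all reads from a sample would then give a rough picture: mostly "junction" reads at this site would be consistent with a crp::miniTn10kan strain, mostly "intact" reads with a wild-type crp.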

Later:  Scott, here's the KW20 genome location I get with a BLAST search for this sequence (reassuringly, this is indeed part of the crp gene):
Query  1        CAAAATACATCACGTCAAGTCACGAAT  27
                |||||||||||||||||||||||||||
Sbjct  1015177  CAAAATACATCACGTCAAGTCACGAAT  1015203  
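The BLAST hit gives the coordinates of the whole 27-base query, not of the insertion point itself. A small sketch of the arithmetic, assuming the hit is on the plus strand with 1-based coordinates (as the ascending Sbjct numbers suggest) and that the * in the published sequence marks the insertion point exactly:

```python
# Flanking sequence from the post; '*' marks the reported insertion point.
MARKED = "CAAAATACATCACG*TCAAGTCACGAAT"

# Subject (KW20 genome) coordinates copied from the BLAST alignment above.
SBJCT_START = 1015177
SBJCT_END = 1015203

left, right = MARKED.split("*")

# Sanity check: the hit should cover exactly the 27 bases of the query.
assert SBJCT_END - SBJCT_START + 1 == len(left) + len(right)

# Genome position of the last base before the '*' and the first base after it.
last_base_before = SBJCT_START + len(left) - 1
first_base_after = last_base_before + 1

print(last_base_before, first_base_after)  # prints "1015190 1015191"
```

So, under those assumptions, read alignments interrupted by the insertion should end at or begin next to positions 1015190/1015191.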


In anticipation that this is what he'll find, I've changed the strain names and descriptions for these samples in the master_sample_key table.
