HI1537 (licA), HI1538 (licB), HI1539 (licC), HI1540 (licD)
has CAAT repeats:
in KW20 (and, it seems, every strain other than crpx):
has AGCAG repeats:
in KW20 (and, it seems, every strain other than taxx):
It looks like the extra repeat in all the strains other than taxx causes a huge difference in expression.
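To make the "one extra repeat" comparison concrete, here is a minimal sketch of how I could count repeat units and see what a one-unit change does to the reading frame. The sequences below are made-up toy strings, not the real strain sequences, and the function names are just mine. It also assumes the repeat tract sits inside the coding sequence, which I haven't checked here.

```python
import re

def longest_tandem_run(seq, unit):
    """Return (start, n_units) for the longest uninterrupted run of `unit` in `seq`.

    Returns (-1, 0) if the unit does not occur at all.
    """
    best = (-1, 0)
    pattern = f"(?:{re.escape(unit.upper())})+"
    for m in re.finditer(pattern, seq.upper()):
        n_units = len(m.group(0)) // len(unit)
        if n_units > best[1]:
            best = (m.start(), n_units)
    return best

def frame_offset(n_units_a, n_units_b, unit_len):
    """Reading-frame offset (0, 1 or 2) between two alleles that differ
    only in the number of repeat units."""
    return ((n_units_a - n_units_b) * unit_len) % 3

# Toy alleles (hypothetical): 4 vs 5 copies of AGCAG between the same flanks.
allele_short = "TTG" + "AGCAG" * 4 + "CCA"
allele_long = "TTG" + "AGCAG" * 5 + "CCA"

_, n_short = longest_tandem_run(allele_short, "AGCAG")
_, n_long = longest_tandem_run(allele_long, "AGCAG")
offset = frame_offset(n_long, n_short, len("AGCAG"))
```

Since neither 4 nor 5 is a multiple of 3, gaining or losing a single CAAT or AGCAG unit always gives a nonzero offset, which is the usual way these repeats switch a gene on or off.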
has AACT repeats:
Note: these repeats go on for more than 100 base pairs. Since our reads are 101 bp long, no single read can cover the whole repeat tract plus unique sequence on both sides, so I don't think it's possible to detect a deletion here. This is the downside of using short reads.
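A rough way to state the read-length limit: to count repeat units from a single read, the read has to cover the entire tract plus at least a little unique flanking sequence on each side. The sketch below is only that arithmetic. The `min_anchor` parameter (how many flanking bases I'd want on each side) is my own assumption, not something an aligner reports.

```python
def can_span(read_len, tract_len, min_anchor=1):
    """True if one read can cover the whole repeat tract plus
    `min_anchor` unique bases on each side."""
    return read_len >= tract_len + 2 * min_anchor

def max_resolvable_units(read_len, unit_len, min_anchor=1):
    """Largest number of repeat units a single read could fully span."""
    return max(0, (read_len - 2 * min_anchor) // unit_len)

# With 101 bp reads, a tract of 101 bp or more can never be spanned,
# even with only one anchoring base on each side.
spans_long_tract = can_span(read_len=101, tract_len=101)

# With 10 anchoring bases per side (an arbitrary choice), a 101 bp read
# tops out at 20 tetranucleotide units.
limit_tetra = max_resolvable_units(read_len=101, unit_len=4, min_anchor=10)
```

So anything longer than roughly 20 to 24 tetranucleotide units is out of reach for these reads, depending on how much flank is required.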
But this gene is highly upregulated in KW20 in sBHI
versus murE at the same timepoint:
I have good reason to think that phase variation is also responsible for this difference.
HI1457 (opa), HI1456 (??)
has no repeats.
This gene pair has been a thorn in my side for a while, mainly because it looks like it is turned on in sBHI only in cells that are hypercompetent:
For strains in BHI, there is roughly a 1.6% chance of seeing this pattern by chance, assuming each strain independently has a 50% chance of having the gene on. But in MIV, it looks like crpx (and maybe hfqx) are the only ones that don't really express this gene. The data don't give a clear picture of what's going on.
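The 1.6% figure is what you get from 0.5^6 = 0.015625, i.e. six strains each independently landing in one particular on/off state with probability 0.5. The count of six is inferred from the number itself and is an assumption in this sketch; the function name is just illustrative.

```python
def prob_specific_pattern(n_strains, p_on=0.5):
    """Probability that `n_strains` independent strains all land in one
    particular on/off pattern when each is on with probability `p_on`.

    With p_on = 0.5 every specific pattern is equally likely, so the
    result is simply 0.5 ** n_strains.
    """
    if p_on != 0.5:
        raise ValueError("this shortcut only holds for p_on = 0.5")
    return 0.5 ** n_strains

# Assumed: six strains (inferred from the 1.6% figure, not counted from the data).
p_pattern = prob_specific_pattern(6)
p_percent = round(p_pattern * 100, 1)
```

If the 50% on/off assumption is off, or the strains are not independent (e.g. they share a recent ancestor in the same phase), this number stops being meaningful.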
I dug around some papers and found this one:
The phasevarion: a genetic system controlling coordinated, random switching of expression of multiple genes.
Unfortunately, I am not able to see which samples have an active mod gene because, again, the repeat region is longer than 100 bp. The mod gene does seem to be expressed (significantly) more in the crp knockout, though. For now, I think I'll treat opa as indirectly phase variable.