PowerShell Remoting Project Home

Thursday, December 29, 2005

Working with SQL server in monad using pre-compiled .NET Class Library (1)

As an immunologist, I have to deal with a lot of plasmids. In order to share plasmids with my colleagues, I stored all information of plasmids in an SQL server (express edition) running on a Windows XP desktop.  I wrote a small winform program to manage the information stored the SQL server. I have all the data type and program logic (including SQL query) packaged in NET Classes. So if I could reuse those .NET classes in monad, I can do some routine works in MSH scripts and will not waste time to start from scratch.

What is plasmid anyway? Well, plasmid is a circular DNA molecule found in E.coli.  Biologists often use them for gene cloning and expression. If you still don’t know what I am talking about, google (http://www.google.com/search?q=plasmid) for details.

First, let’s make a DLL ready for MSH.
I opened my winform project; got rid of GUI components; recompiled the project to Class Library (PlasmidView.DLL). The DLL contained classes below:




using System;
using System.Data;
using System.Data.SqlClient;

namespace PlasmidView
{
     public class SQLProvider
     {
          // retuen a System.Data.SqlClient.SqlConnection
private static SqlConnection GetDBConnection()     {
               string ConnectionStr = "server=localhost\\SQLExpress;database=PlasmidDB;Integrated Security=SSPI;Trusted_Connection=Yes";
               try {
                    return new SqlConnection(ConnectionStr);}
               catch (SqlException e)     {
                    string SqlMsg = ConstructSqlErrorMessage(e);}
               return null;}

          public static string ConstructSqlErrorMessage(SqlException e)
          {
               string SqlMsg="";
               for (int i=0; i < e.Errors.Count; i++)     {
                    SqlMsg += "Index #" + i + ";" +
                         "Message: " + e.Errors[i].Message + ";" +
                         "LineNumber: " + e.Errors[i].LineNumber + ";" +
                         "Source: " + e.Errors[i].Source + ";" +
                         "Procedure: " + e.Errors[i].Procedure;}
               return SqlMsg;}

          // return a “status” object represent database statitcs
          public static Status GetDBStatus () {
               Status StatusToGet= null;
               SqlConnection myConnection= GetDBConnection();
               try {
                    SqlCommand myCommand = new SqlCommand("dbo.DBStatus", myConnection);
// run a stored procedure “dbo.DBStatus”
                    myCommand.CommandType = CommandType.StoredProcedure;

                    myConnection.Open();
                    SqlDataReader reader = myCommand.ExecuteReader();
                    if (reader.Read()){                    
                         StatusToGet = new Status();
                         StatusToGet.TotalPlasmidNum          =     reader.GetInt32(0);
                         StatusToGet.TotalBackboneNum     =     reader.GetInt32(1);
                         StatusToGet.LastPlasmidID          =     reader.GetInt32(2);
                         StatusToGet.TotalEnzymeNum          =     reader.GetInt32(3);
                         StatusToGet.TotalUserNum          =     reader.GetInt32(4);}
               }
               catch (SqlException e){
                    string SqlMsg = ConstructSqlErrorMessage(e);}
               finally     {
                    myConnection.Close();}
               return StatusToGet;
          }

     }
     public class Status
     {
          int totalPlasmidNum          = 0;
          int totalBackboneNum          = 0;
          int lastPlasmidID          = 0;
          int totalEnzymeNum          = 0;
          int totalUserNum          = 0;

          public int TotalPlasmidNum {
               get {return totalPlasmidNum;}
               set {totalPlasmidNum=value;}
          }

          public int TotalBackboneNum {
               get {return totalBackboneNum;}
               set {totalBackboneNum=value;}
          }

          public int LastPlasmidID {
               get {return lastPlasmidID;}
               set {lastPlasmidID=value;}
          }

          public int TotalEnzymeNum {
               get {return totalEnzymeNum;}
               set {totalEnzymeNum=value;}
          }

          public int TotalUserNum     {
               get {return totalUserNum;}
               set {totalUserNum=value;}
          }
          public Status()     {}
}



Second, let’s load the PlasmidView.DLL to MSH

[void][System.Reflection.Assembly]::LoadFile("D:\MSH\PlasmidView.dll")


Third, Let’s create a instance of  “Status” object:

$Status = new-object PlasmidView.Status
$Status |gm

    TypeName: PlasmidView.Status

Name                 MemberType Definition
----                 ---------- ----------
Equals               Method     System.Boolean Equals(Object obj)
get_LastPlasmidID    Method     System.Int32 get_LastPlasmidID()
get_TotalBackboneNum Method     System.Int32 get_TotalBackboneNum()
get_TotalEnzymeNum   Method     System.Int32 get_TotalEnzymeNum()
get_TotalPlasmidNum  Method     System.Int32 get_TotalPlasmidNum()
get_TotalUserNum     Method     System.Int32 get_TotalUserNum()
GetHashCode          Method     System.Int32 GetHashCode()
GetType              Method     System.Type GetType()
set_LastPlasmidID    Method     System.Void set_LastPlasmidID(Int32 value)
set_TotalBackboneNum Method     System.Void set_TotalBackboneNum(Int32 value)
set_TotalEnzymeNum   Method     System.Void set_TotalEnzymeNum(Int32 value)
set_TotalPlasmidNum  Method     System.Void set_TotalPlasmidNum(Int32 value)
set_TotalUserNum     Method     System.Void set_TotalUserNum(Int32 value)
ToString             Method     System.String ToString()
LastPlasmidID        Property   System.Int32 LastPlasmidID {get;set;}
TotalBackboneNum     Property   System.Int32 TotalBackboneNum {get;set;}
TotalEnzymeNum       Property   System.Int32 TotalEnzymeNum {get;set;}
TotalPlasmidNum      Property   System.Int32 TotalPlasmidNum {get;set;}
TotalUserNum         Property   System.Int32 TotalUserNum {get;set;}

Now, let’s try the GetDBStatus () method

[PlasmidView.SQLProvider]::GetDBStatus()


TotalPlasmidNum  : 21
TotalBackboneNum : 8
LastPlasmidID    : 20
TotalEnzymeNum   : 103
TotalUserNum     : 8


There two things I want to emphases:

1. The original design requires a username and password for database access.   I used Windows NT authentication for SQL server connection, and PlasmidView.SQLProvider class expose all SQL server operation as “static” methods, so the security check has been bypassed.  To regain access control over different user, I have to change C# code to check user credential every time before a SQL server operation.

2. There is really no big difference when using a .Net class in C# or monad. What you need to do is to load the DLL and create the object by new-object cmdlet. Because monad automatically convert “public” class properties to String, quoting a object under MSH prompt will get formatted list of object public properties.  

Tags:    



Wednesday, December 28, 2005

NCBI Blastn under MSH command line

From NCBI web site:
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

Blastn is one of my most frequently used tools from NCBI. In stead of going to their web site, I can now run blastn under command line. It is convenient and even faster than web interfaces. Although I use "nr" database here, you can change the URL string to use other database or other blast program.

As for scripting, nothing magic here. Just rewrite original perl script using .NET class: System.Net.WebClient . For Regular Expression, we can use "-match" expression.
Thanks for Lee Holmes Blog

# Begin of script
# =============================================================
# This code is for test purposes only. Use it at your own risk.
# Please do not submit or retrieve more than one request every
# two seconds. Results will be kept at NCBI for 24 hours. For
# best batch performance,they recommend that you submit requests
# after 2000 EST (0100 GMT) and retrieve results before 0500 EST
# (1000 GMT).
# reference: http://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html
# reference: http://www.ncbi.nlm.nih.gov/blast/docs/web_blast.pl
# =============================================================
# return codes:
# 0 – success
# 1 - invalid arguments
# 2 - no hits found
# 3 - rid expired
# 4 - search failed
# 5 - unknown error
# =========================================================
param([string] $query)

if (-not $query)
{
"Please specify query seqence!"
return 1
}

#Submit query sequence
"================================================================="
"Query sequence:"
""
$query
""
"Submit query sequence..."
$uri="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Put&PROGRAM=blastn&DATABASE=nr&QUERY=" + $query
$BlastClient = new-object System.Net.WebClient
$pagecontent = $BlastClient.DownloadString($uri);
"================================================================="
# Get RID
$pagecontent -match " RID = (.*)"
$RID=$Matches[1]
"RID=" + $RID
" "
$pagecontent -match " RTOE = (.*)"
$TimeToComplete= $Matches[1]
"Time to complete search: " + $TimeToComplete + " seconds or sooner. Waiting..."
" "
Start-sleep $TimeToComplete

#Waiting for results
While ($true)
{
Start-sleep 5
$uri= "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Get&FORMAT_OBJECT=SearchInfo&RID=" + $RID

$pagecontent = $BlastClient.DownloadString($uri);

if ($pagecontent -match "Status=WAITING")
{
"Results not ready, waiting..."
"================================================================="
continue
}

if ($pagecontent -match "Status=FAILED")
{
"Search failed, exiting..."
return 2
}

if ($pagecontent -match "Status=UNKNOWN")
{
"Search expired, exiting..."
return 3
}

if ($pagecontent -match "Status=READY")
{
if ($pagecontent -match "ThereAreHits=yes")
{
"Search complete, retrieving results...";
"================================================================="
break
}
else
{
"No hits found.\n";
return 4
}
}
return 5
}

# Get results
$uri= "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Get&RID=" + $RID + "&ALIGNMENTS=500&ALIGNMENT_VIEW=QueryAnchored&FORMATOBJECT=Alignment&FORMAT_TYPE=TEXT"
$pagecontent = $BlastClient.DownloadString($uri);
$pagecontent >.\blast.results
Get-content .\blast.results | more
return 0
#End of script

So if we do this under MSH prompt:
. blastn-nr.msh AAAAAAAAGGGGGGCCCCCTTTT


the output would like:

==============================================================
Query sequence: AAAAAAAAGGGGGGCCCCCTTTT
Submit query sequence...
RID=1135712055-18130-135523741431.BLASTQ1
Time to complete search: 12 seconds or sooner. Waiting...
Results not ready, waiting...
Search complete, retrieving results...
=============================================================

BLASTN 2.2.12 [Aug-07-2005]Reference: Altschul, Stephen F., Thomas L. Madden,
Alejandro A. Schffer,Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
(1997), "Gapped BLAST and PSI-BLAST: a new generation ofprotein database search
programs", Nucleic Acids Res. 25:3389-3402.

RID: 1135712055-18130-135523741431.BLASTQ1

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,GSS,environmental
samples or phase 0, 1 or 2 HTGS sequences) 3,659,359 sequences; 16,326,075,880
total letters

Query= (23 letters)

Score ESequences producing significant alignments: (Bits) Valued
AK202294.1 Mus musculus cDNA, clone:Y1G0140P22, strand:m... 40.1 0.067gb
AF467007.1AF467007S1 Homo sapiens histidine N-methyltrans... 38.2 0.27
dbjAK192209.1 Mus musculus cDNA, clone:Y1G0108D05, strand:m... 38.2 0.27
embAJ333598.1HSA333598 Homo sapiens genomic sequence surrou... 38.2 0.27
embAJ343149.1HSA343149 Homo sapiens genomic sequence surrou... 38.2 0.27
embAJ343845.1HSA343845 Homo sapiens genomic sequence surrou... 38.2 0.27
gbAY811874.1 Schistosoma japonicum SJCHGC01957 protein mRNA, p 36.2 1.0
dbjAB232923.1 Oryzias latipes hox gene cluster, complete cds, 36.2 1.0
dbjBS000120.2 Pan troglodytes chromosome 22 clone:RP43-007D... 36.2 1.0
......

>gbAY811874.1 Schistosoma japonicum SJCHGC01957 protein mRNA, partial cds
Length=718
Score = 36.2 bits (18),
Expect = 1.0
Identities = 18/18 (100%),
Gaps = 0/18 (0%)
Strand=Plus/Minus
Query 2 AAAAAAAGGGGGGCCCCC 19
Sbjct 156 AAAAAAAGGGGGGCCCCC 139
......

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
GSS,environmentalsamples or phase 0, 1 or 2 HTGS sequences)
Posted date: Dec 26, 2005 4:16 AM
Number of letters in database: -853,793,300
Number of sequences in database: 3,659,359
Lambda K H 1.37 0.711 1.31
GappedLambda K H 1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Gap Penalties:
Existence: 5,
Extension: 2
Number of Sequences: 3659359
Number of Hits to DB: 2403072
Number of extensions: 36138
Number of successful extensions: 36138
Number of sequences better than 10: 21
Number of HSP's better than 10 without gapping: 21
Number of HSP's gapped: 36138
Number of HSP's successfully gapped: 21
Number of extra gapped extensions for HSPs above 10: 36070
Length of query: 23
Length of database: 16326075880
Length adjustment: 18
Effective length of query: 5
Effective length of database: 16260207418
Effective search space: 81301037090
Effective search space used: 81301037090
A: 0X1: 11 (21.8 bits)
X2: 15 (29.7 bits)
X3: 25 (49.6 bits)
S1: 11 (22.3 bits)
S2: 17 (34.2 bits)
====================================================
Have fun!

[Edit: Monad has now been renamed to Windows PowerShell. This script or discussion may require slight adjustments before it applies directly to newer builds.]

Tags:       



Monday, December 19, 2005

Play with ACL in MSH

I have a laptop running Windows XP home edition. As you known, to force you pay more money for professional edition, Mico$oft disabled the "Security" tab in file or directory "Properties" dialogue window. To set ACL, I have to use cacls.exe (come with windows). If you want more functionality, use setacl.exe (under GPL, you can download from sourceforge.org)

It is OK, if you just do adjustments for a few files. It becomes a tedious job, if you have lot of files or directories to modify. So let's give MSH a try.

There are two cmd-let designed for this job:
get-acl : Gets the access control list (ACL) associated with a file or object.
get-acl [[-Path] System.String[]] [[-Filter] System.String] [[-Include]
system.String[]] [[-Exclude] System.String[]] [[-Audit] [System.Boolean]]

set-acl: Set the security Access Control List for an item or items.
set-acl [-ACLObject] aclobject [-Path path]
[-Include include] [-Exclude exclude] [-Filter filter] [-Passthru]

But there is a trick here. In order to set group or set owner, you need an instance of [system.security.principal.ntaccount] object in hand.

#######################################
# Let's get acl for file text.txt
$acl=get-acl text.txt
$acl format-list

#You will get something like
#Path : FileSystem::D:\text.txt
#Owner : Computer\me
#Group : Computer\None
#Access : BUILTIN\Administrators Allow FullControl
# Computer\me Allow FullControl
#Audit :
#Sddl : Bla…Bla…Bla…

#So we can manipulate this acl object now. Let's try to change group to
# BUILTIN\Administrators.
#Get a [system.security.principal.ntaccount] object
$Account = new-object system.security.principal.ntaccount("Administrators ")

#To check whether the group is valid
$SID = $Account.translate([system.security.principal.securityidentifier])
$SID

#You will see
#BinaryLength AccountDomainSid Value
#------------ ---------------- -----
# 16 S-1-5-32-544
#If you see some error message here, you $Account is invalid.

#Use setgroup method of acl object
$acl.setgroup($Account)
$acl format-list

#You will get something like
#Path : FileSystem::D:\text.txt
#Owner : Computer\me
#Group : BUILTIN\Administrators (We made change here!!!!!!!!!!!!!!!)
#Access : BUILTIN\Administrators Allow FullControl
# Computer\me Allow FullControl
#Audit :
#Sddl : Bla…Bla…Bla…

#But this ACL object is in memory, we need to apply them to file
set-acl -aclobject $acl -path D:\text.txt

#make sure you have both -aclobject and -path, otherwise you will get some error.
###############################################

This scheme can be easily changed to modify directory acl or grant access to any user.
You can use get-member cmd-let to explore other methods or property of $acl. I will leave those excise to readers.

Reference
http://mow001.blogspot.com/2005/10/getting-and-using-securityprincipal.html

[Edit: Monad has now been renamed to Windows PowerShell. This script or discussion may require slight adjustments before it applies directly to newer builds.]

Tags:       



Friday, December 16, 2005

Find Amino Acid mutation

Yesterday, we had a quiz on Mass Spectrometry. There is one question Find Amino Acid mutation:
There is a single amino acid mutation on SOD protein from ALS patient. After degradation by Lys-C proteinase and MALDI-TOF analysis, there is a molecular ion at 1689Da. But in wild type SOD protein, it should be 1765Da. That is to say the single amino acid mutation cause reduce of peptide molacular weight of 66Da. What exactly is that mutation.

In a another word: there are 20 amino acid,the difference of molecular weight is 66Da, what 2 amino acid is that. This is a simple combination. In the DOS world, I proberbly use QBasic. But now we can try MSH:


#Name of Amino acid
$AAName = @("Ala";"Arg";"Asn";"Asp";"Cys";"Glu";"Gln";"Gly";"His";"Ile";"Leu";
"Lys";"Met";"Phe";"Pro";"Ser";"Thr";"Trp";"Tyr";"Val")

#Molecular weight of amino acid
$AAMw = @(71.08;156.19;114.10;115.09;103.15;129.12;128.13;57.05;137.14;113.16;
113.16;128.17;131.20;147.18;97.12;87.08;101.11;186.21;163.18;99.13)

#Substract one verse another
for ($i=0;$i -lt 20;$i++) {
for ($j=$i; $j -lt 20;$j++) {
if ($AAMw[$i] -ge $AAMw[$j]) {
$change= $AAMw[$i] - $AAMw[$j]
}
else {
$change= $AAMw[$j] - $AAMw[$i]
}
if ($change -gt 66 -and $change -lt 67) {
if ($AAMw[$i] -ge $AAMw[$j]) {
$AAName[$i] + "(" + $AAMw[$i].Tostring() + ")" + "->" + $AAName[$j] + "("
+ $AAMw[$j].ToString() + ")" + ":" + $change.ToString()
}
else {
$AAName[$j] + "(" + $AAMw[$j].Tostring() + ")" + "->" + $AAName[$i] + "("
+ $AAMw[$i].ToString() + ")" + ":" + $change.ToString()
}

}
}
}



Results:
His(137.14)->Ala(71.08):66.06
Tyr(163.18)->Pro(97.12):66.06

Tags:    



Friday, December 09, 2005

My First MSH Script


#===========================================================
#This script will check colinux service, start a colinux console (NT).
#This script comes with NO warrenty! Use at your own risk!
#===========================================================

"Cheching Colinux Service...."
$CoService = get-service Where-object {$_.ServiceName -like "*colinux*"}
if (!$CoService) { throw "Colinux Service Not Installed ! "}
"Found Colinux Service..."
$CoService
if ( $coservice.status.ToString() -ne "Running"){
"Colinux Service is NOT running ! Try to start Colinux Service..."
$coservice.Start()
}
"Colinux Service Is Running ! Get Colinux-daemon Process..."
$CoProcess = get-process Where-object {$_.ProcessName -like "*colinux-daemon*"}
if (!$CoProcess){ throw "Colinux-daemon Is Not Running ! " }
"Colinux-daemon Is Running ... "
$CoProcess
"Try to start Colinux console..."
$CoConsoleProcess = get-process Where-object {$_.ProcessName -like "*colinux-console*"}
$CoPID = [string] $CoProcess.Id
if (!$CoConsoleProcess){
D:\colinux\colinux-console-nt -a $CoPID
}
else{
"Colinux-console (NT) already running !
"Quiting ..."
}

Tags:    



Friday, December 02, 2005

Ready for MSH

MSH, code name "Monad" , is next generation of microsoft command line tools.

Get MSH
http://www.microsoft.com/technet/scriptcenter/topics/msh/download.mspx

Information for monad
http://www.reskit.net/monad/

Monad team blog
http://blogs.msdn.com/monad/

Tags: