User Tutorial
If you have an account on an existing UI machine running the latest EMI software then you can try out this tutorial there. If not, you can install the client tools on your own UI as follows:
Configure EMI repository (see http://emisoft.web.cern.ch/emisoft/index.html).
Install the EMI client package using the command:
$ yum install emi-ui
Get credentials: you must have a grid certificate from your national Certification Authority or TERENA TCS in order to use this tutorial. Your credentials should be in jks or pkcs12 format for UNICORE, and in pem format for ARC, dCache and gLite. You can convert between these formats as follows:
Create private key pem file from pkcs12 file:
$ openssl pkcs12 -nocerts -in cert.p12 -out userkey.pem
Enter Import Password: (insert your certificate password)
MAC verified OK
Enter PEM pass phrase: (insert your PEM pass phrase - this will be your pass phrase to access your key in future)
Verifying - Enter PEM pass phrase: (reinsert your PEM pass phrase)
Create user certificate pem file from pkcs12 file:
$ openssl pkcs12 -clcerts -nokeys -in cert.p12 -out usercert.pem
Enter Import Password: (insert your certificate password)
MAC verified OK
Create pkcs12 file from private key and user certificate pem files, and the CA certificate file (you can usually download this from your Certificate Authority's webpage):
$ openssl pkcs12 -export -out cert.p12 -inkey userkey.pem -in usercert.pem -certfile CAcert.crt
Enter pass phrase for userkey.pem: (insert your PEM pass phrase)
Enter Export Password: (insert your certificate password)
Verifying - Enter Export Password: (reinsert your certificate password)
For more information on installing the EMI UI or other EMI components please see the document Generic Installation & Configuration for EMI 1.
Authentication
UNICORE
Preparing preferences file
An example preferences file can be found in the <UCC_HOME>/conf directory (<UCC_HOME> is the directory where the UNICORE Commandline Client was unpacked). The file should specify the keystore (in jks or pkcs12 format), the registry URL and, optionally, the password. A keystore is a file which contains the user's certificate and the certificates of the trusted CAs. Trusted certificates may also be provided as separate jks files.
storetype=<jks or pkcs12>
keystore=<user keystore>
password=<user password>
registry=<user registry URL>
#optional: configure separate truststore (must be JKS)
truststore=<user truststore>
truststorePassword=<user password>
By default UCC looks for the preferences file in the <user_home>/.ucc/ directory, so the file should be copied there.
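As a rough sketch of this step, the shell function below writes a preferences file containing the storetype, keystore and registry keys described in this section. The function name write_ucc_prefs is hypothetical, the file name "preferences" is an assumption, and the password line is deliberately omitted so that the password is not stored on disk.

```shell
# Sketch: write a UCC preferences file into a target directory.
# write_ucc_prefs is a hypothetical helper; the file name "preferences" is assumed.
write_ucc_prefs() {
    dir=$1 storetype=$2 keystore=$3 registry=$4
    mkdir -p "$dir" || return 1
    # No password line is written, so the client has to ask for it instead.
    cat > "$dir/preferences" <<EOF
storetype=$storetype
keystore=$keystore
registry=$registry
EOF
    # The file points at your credentials, so keep it private to the owner.
    chmod 600 "$dir/preferences"
}
```

It could be called, for instance, as write_ucc_prefs "$HOME/.ucc" pkcs12 "$HOME/cert.p12" "<user registry URL>", with the registry URL supplied by your site.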
Connecting to the Grid
To connect to the Grid, run the connect command. To get help, use the ucc command with the -h option:
$ ucc connect
$ ucc -h
The user's password can be stored in the preferences file; otherwise the user is asked for it on every call. To avoid that, an interactive mode can be started with ucc shell. In that mode the user does not have to type ucc before each command. UCC Shell also offers command expansion.
ARC
You need a certificate and a key file in order to use the Grid. The default paths for them are ~/.globus/usercert.pem and ~/.globus/userkey.pem. (The permissions of the key file must be set to 400, so that only the owner can read it.) If you want to put them in a different place, you have to specify the paths in the ~/.arc/client.conf file, like this:
keypath=/home/user/.cert/userkey.pem
certificatepath=/home/user/.cert/usercert.pem
Your key file is protected with a passphrase. It is inconvenient to type this passphrase every time you issue a command; moreover, other Grid services acting on your behalf do not know the passphrase. In order to work on the Grid, you have to create a public proxy certificate, which has a limited lifetime and is not protected by a password. To create the proxy, use the arcproxy command, like this:
$ arcproxy
Your identity: /DC=***/O=***/CN=***
Enter pass phrase for /home/user/.cert/userkey.pem:
...++++++
.++++++
Proxy generation succeeded
Your proxy is valid until: 2011-05-24 03:08:35
The validity time of the proxy certificate is 12 hours by default. It can be changed, for example to 1 hour, by using arcproxy --constraint=validityPeriod=1H, or arcproxy -c validityPeriod=1H in the shorter form. The arcproxy --info command shows the current proxy's validity and location. If you belong to a Virtual Organisation (VO), arcproxy can also create a proxy with the special VOMS extension, which certifies that you are indeed a member of this VO and can access its resources:
$ arcproxy --voms atlas
Your identity: /DC=***/O=***/CN=***
Enter pass phrase for /home/user/.cert/userkey.pem:
.......................................................................++++++
.................++++++
Contacting VOMS server (named atlas): voms.cern.ch on port: 15001
Proxy generation succeeded
Your proxy is valid until: 2011-05-24 03:08:35
In order to use this feature, enter your VO contact string in the file ~/.voms/vomses (ask your VO managers for details). Please refer to the ARC UI Manual for further options of arcproxy.
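The key file permission requirement mentioned in this section (mode 400) is easy to forget after copying files between machines. The following is a minimal sketch of a check; key_mode_ok is a hypothetical helper name and is not part of the ARC client tools.

```shell
# Sketch: check that a private key file is readable by its owner only (mode 400).
# key_mode_ok is a hypothetical helper, not part of the ARC client tools.
key_mode_ok() {
    # The first ten characters of "ls -ld" output hold the file type and permission bits.
    perms=$(ls -ld "$1" 2>/dev/null | cut -c1-10)
    [ "$perms" = "-r--------" ]
}
```

A possible use is key_mode_ok ~/.globus/userkey.pem || chmod 400 ~/.globus/userkey.pem, run before arcproxy.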
gLite
The gLite UI tools use the same certificate and key file as ARC. By default these are stored in a .globus directory in your home directory.
keypath=/home/user/.globus/userkey.pem
certificatepath=/home/user/.globus/usercert.pem
Create your VOMS proxy
$ voms-proxy-init --voms <VO>
and check that it is valid
$ voms-proxy-info -all
subject : /C=***/O=***/OU=Personal Certificate/L=***/CN=***/CN=proxy
issuer : /C=***/O=***/OU=Personal Certificate/L=***/CN=***
identity : /C=***/O=***/OU=Personal Certificate/L=***/CN=***
type : proxy
strength : 1024 bits
path : /tmp/x509up_u539
timeleft : 11:59:51
key usage : Digital Signature, Key Encipherment, Data Encipherment
=== VO testers.eu-emi.eu extension information ===
VO : testers.eu-emi.eu
subject : /C=***/O=***/OU=Personal Certificate/L=***/CN=***
issuer : /C=***/O=***/OU=***/L=***/CN=***
attribute : /***/Role=NULL/Capability=NULL
timeleft : 11:59:51
uri : ***
As gLite and ARC use the same proxy file, you can usually use a proxy created with arcproxy to run gLite jobs and vice versa.
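The timeleft field in the listing above is given as HH:MM:SS. When a script has to decide whether a proxy is still long-lived enough for a job, it can help to convert that value into seconds first. The sketch below does only this conversion; timeleft_to_seconds is a hypothetical helper name, and the sample value is the one from the listing.

```shell
# Sketch: convert a "timeleft" value (HH:MM:SS) into seconds.
# timeleft_to_seconds is a hypothetical helper name.
timeleft_to_seconds() {
    # awk does decimal arithmetic, so fields such as "08" are not read as octal.
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Sample value copied from the listing in this section.
timeleft_to_seconds 11:59:51    # prints 43191
```

The result can then be compared with a threshold of your choice, for example 3600 seconds, before a new proxy is created.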
Browsing resources
UNICORE
To check the names of the available Target Systems and storages, and to list applications, the user can use the list-sites, list-storages and list-applications commands:
$ ucc list-sites
$ ucc list-storages
$ ucc list-applications
ARC
The arcinfo [cluster ...] command prints information about the available resources. With the --long argument it prints more information. You can specify the URLs of the clusters as arguments of the command. You must specify default services (which define the entry point to the Grid) or aliases (which can be used instead of the long form of individual site URLs) in your ~/.arc/client.conf file, like this:
[common]
defaultservices=index:ARC0:ldap://index1.nordugrid.org:2135/Mds-Vo-name=NorduGrid,o=grid
index:ARC0:ldap://index2.nordugrid.org:2135/Mds-Vo-name=NorduGrid,o=grid
index:ARC0:ldap://index3.nordugrid.org:2135/Mds-Vo-name=NorduGrid,o=grid
index:ARC0:ldap://index4.nordugrid.org:2135/Mds-Vo-name=NorduGrid,o=grid
computing:ARC0:ldap://ce1.grid.upjs.sk:2135/Mds-Vo-name=local,o=Grid
computing:ARC1:https://pgs03.grid.upjs.sk:50000/arex
...
[alias]
arc0=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid
Please refer to the ARC UI Manual for the format of the configuration file.
gLite
With the lcg-infosites command we can list the resources available to our VO. First we see which Computing Elements are available:
$ lcg-infosites --vo <VO>
# CPU  Free  Total Jobs  Running  Waiting  ComputingElement
----------------------------------------------------------------
   12    12           0        0        0  cert-09.cnaf.infn.it:8443/cream-lsf-demo
    0     0           0        0        0  cream-37.pd.infn.it:8443/cream-lsf-cert
    0     0           0        0        0  cream-37.pd.infn.it:8443/cream-lsf-creamtest1
    0     0           2        0        2  cream-37.pd.infn.it:8443/cream-lsf-creamtest2
    8     8           0        0        0  lxbra2308.cern.ch:8443/cream-pbs-testersemi
Now we query the information system to find out which Storage Elements are available:
$ lcg-infosites --vo <VO> se
Avail Space(kB)  Used Space(kB)  Type  SE
------------------------------------------
        7908181         1010947  SRM   cork.desy.de
      101168616         6153137  SRM   lxbra1910.cern.ch
       99630252         7691501  SRM   lxbra2502.cern.ch
       10511159          215773  SRM   lxbra2506v1.cern.ch
Submission of jobs
UNICORE
Job description
The UNICORE Commandline Client uses a JSON job description format which allows users to specify the application or executable they want to run, its arguments, environment settings and the files to transfer. The example job description below can be copied to the date.u file.
# simple job: run Date
{
  ApplicationName: Date,
  ApplicationVersion: 1.0,
}
Running job
To run the job, use the ucc run command:
$ ucc run date.u -v
In this case the standard output went, for example, to 1bc1bb08-7737-4fb1-854e-5d89ba18d7f0.stdout. The -v option turns on verbose mode. The -b option gives short output file names (without the hash of the job). The -a option runs the job asynchronously: the input files are staged in and the job is submitted, but the results can be downloaded later using the get-output command. To get the status of a specific job, use the get-status command. As an argument, one can use either the job file returned by the run -a command or the End Point Reference (EPR) obtained from list-jobs:
$ ucc run -a date.u -v -b
$ ucc list-jobs
$ ucc get-status job
$ ucc get-output job
Running job on a set of files
To run UNICORE jobs on a set of files, the user can put the job descriptions in one directory (e.g. indir/) and use the batch command. The -i argument indicates the source directory (with .u scripts) and the -o argument the directory for output files:
$ ucc batch -i indir -o outdir
ARC
The arcsub command provides features to communicate with the information systems, do brokering, translate job descriptions, move input files and submit jobs to the clusters. When your defaultservices are properly configured, arcsub will automatically select the best Grid site for you. If you prefer to submit jobs to a specific site instead, you can use the -c arc0 argument to specify this site (here arc0 is the alias described in the configuration example, but it can also be a site's IP address or hostname). The native ARC job description is written in XRSL format (JSDL is also possible, as well as gLite JDL). Here is a simple XRSL job description:
$ cat myjob.xrsl
&(executable="/bin/echo")
(arguments="Hello World")
(stdout="hello.txt")
Submit this job to the Grid simply as
$ arcsub myjob.xrsl
The command will print a long URL, which is the ID of the job. This ID must be used later to query the job's information, get the results, kill the job, renew its proxy if it expires while the job is running, and perform other operations.
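When many similar jobs are submitted, the XRSL description shown above can be generated from a script instead of being edited by hand. The following is a minimal sketch that reproduces the three attributes used in this section; make_xrsl is a hypothetical helper name, and it does no quoting or escaping of its arguments.

```shell
# Sketch: write a minimal XRSL job description like the one in this section.
# make_xrsl is a hypothetical helper; it does no quoting or escaping of its arguments.
make_xrsl() {
    # $1 = executable, $2 = argument string, $3 = file name for standard output
    printf '&(executable="%s")\n(arguments="%s")\n(stdout="%s")\n' "$1" "$2" "$3"
}
```

For example, make_xrsl /bin/echo "Hello World" hello.txt > myjob.xrsl recreates the myjob.xrsl file used with arcsub above.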
The arcstat command shows the status of the job (replace JOBID with the ID of your job):
$ arcstat JOBID
The arccat command prints the standard output or error of the job:
$ arccat JOBID
The arcget command downloads the results of a finished job and removes the job from the Grid:
$ arcget JOBID
The --all argument applies these commands to all active jobs:
$ arcstat --all
$ arccat --all
$ arcget --all
You can also store selected job IDs in a file and use it as input:
$ arcstat -i myjobs.txt
A very useful command is arcsync: when you move to a different computer, you can synchronise the list of your jobs on the Grid by simply typing:
$ arcsync
gLite
Job submission requests are expressed in JDL (Job Description Language). Below is a very simple but usable example, which just runs "uname -a" on the executing node:
$ cat uname.jdl
Type = "Job";
JobType = "normal";
Executable = "/bin/uname";
StdOutput = "uname.out";
StdError = "uname.err";
OutputSandbox = {"uname.out","uname.err"};
Arguments = "-a";
requirements = other.GlueCEStateStatus == "Production";
rank = -other.GlueCEStateEstimatedResponseTime;
RetryCount = 0;
We now submit the job to the Workload Management System (WMS), which will find a suitable resource on which our job can run:
$ glite-wms-job-submit -a uname.jdl
Connecting to the service https://lxbra2303.cern.ch:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://lxbra2303.cern.ch:9000/F0KY_m0DBH5wpXzLt59q5A
==========================================================================
On success, the submission command returns a job identifier, which we use to monitor the job status and, once the job is done, to retrieve the output:
$ glite-wms-job-status https://lxbra2303.cern.ch:9000/F0KY_m0DBH5wpXzLt59q5A
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://lxbra2303.cern.ch:9000/F0KY_m0DBH5wpXzLt59q5A
Current Status: Done (Success)
Logged Reason(s):
 - job completed
 - Job Terminated Successfully
Exit code: 0
Status Reason: Job Terminated Successfully
Destination: lxbra2308.cern.ch:8443/cream-pbs-testersemi
Submitted: Sat Jul 9 13:32:09 2011 CEST
==========================================================================
$ glite-wms-job-output https://lxbra2303.cern.ch:9000/F0KY_m0DBH5wpXzLt59q5A
Connecting to the service https://lxbra2303.cern.ch:7443/glite_wms_wmproxy_server
================================================================================
 JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://lxbra2303.cern.ch:9000/F0KY_m0DBH5wpXzLt59q5A
have been successfully retrieved and stored in the directory:
/tmp/jobOutput/budapest40_F0KY_m0DBH5wpXzLt59q5A
================================================================================
$ ls /tmp/jobOutput/budapest40_F0KY_m0DBH5wpXzLt59q5A/
uname.err uname.out
$ ls -l /tmp/jobOutput/budapest40_F0KY_m0DBH5wpXzLt59q5A/
total 4
-rw-r--r-- 1 budapest40 users 0 Jul 9 13:35 uname.err
-rw-r--r-- 1 budapest40 users 116 Jul 9 13:35 uname.out
$ cat /tmp/jobOutput/budapest40_F0KY_m0DBH5wpXzLt59q5A/uname.out
Linux lxbra2506v6.cern.ch 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 13:35:45 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Submitting Multiple Jobs
There are several ways to submit multiple jobs to a gLite Grid. The simplest is to submit separate jobs as a collection of jobs.
$ glite-wms-job-submit -a --collection jdls/
Connecting to the service https://lxbra2303.cern.ch:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://lxbra2303.cern.ch:9000/y7eIsv-bHDpuNjE8v2Y-yw
==========================================================================
You can then view the status and get the output of all jobs via the single job ID returned by glite-wms-job-submit.
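Since the job identifier is needed for every later status or output command, it can be convenient to pull it out of the submission output in a script. The sketch below assumes that the identifier is the line directly following "Your job identifier is:", as in the listings in this section; extract_job_id is a hypothetical helper name that reads the submission output on standard input.

```shell
# Sketch: pull the job identifier out of glite-wms-job-submit output read from stdin.
# extract_job_id is a hypothetical helper; it assumes the identifier is the line
# that directly follows "Your job identifier is:".
extract_job_id() {
    awk '/^Your job identifier is:/ { getline; print; exit }'
}
```

One possible use is glite-wms-job-submit -a uname.jdl | extract_job_id > jobid.txt, after which the saved identifier can be passed to glite-wms-job-status and glite-wms-job-output.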
Data management
UNICORE
Import/Export files
To import files from the local computer to the job directory (or to export files from the job directory to the local computer), the user should specify the source and target files in the .u script:
{
  Imports: [
    { From: "/path/fileName", To: "remoteFileName" },
  ],
  Exports: [
    { From: "remoteFileName", To: "/path/localFileName" },
  ]
}
Storage
To import files from UNICORE storage (or to export files to UNICORE storage), the storage address in the job description file should have the format u6://TargetSystemName/StorageName/fileName:
{
  Imports: [
    { From: "u6://TargetSystemName/StorageName/fileName", To: "remoteFileName" },
  ],
  Exports: [
    { From: "remoteFileName", To: "u6://TargetSystemName/StorageName/fileName" },
  ]
}
Files can also be uploaded to and downloaded from the storage using the ucc put-file and get-file commands. The ls command lists the files in a directory:
$ ucc put-file -s fileName -t u6://TargetSystemName/StorageName/fileName
$ ucc get-file -s u6://TargetSystemName/StorageName/fileName -t newFileName
$ ucc ls u6://TargetSystemName/StorageName/fileName
Resources
In the Resources section of the .u script, the user can specify the resources needed to run the job on the remote system. The section may look as follows:
Resources: {
  Memory: 128000000,
  Nodes: 1,
  CPUs: 8,
}
Example
In this example, localScript.sh is a script which writes the word "Hello" to a file named "newFile":
echo "Hello" >> newFile
The user can create the localScript.sh file on the local computer. To upload it to the storage, first check the Target System name (in this example: EMI-UNICOREX) and the available storages (in this example: Home). After uploading the file, the storage can be listed:
$ ucc list-sites
$ ucc list-storages
$ ucc put-file -s localScript.sh -t u6://EMI-UNICOREX/Home/storageScript.sh
$ ucc ls u6://EMI-UNICOREX/Home
In the example, the script in the storage was named storageScript.sh, so this file should appear in the listing produced by the ucc ls command. The bash.u file is a job description which imports storageScript.sh from the Home storage to the job directory and renames it to remoteScript.sh. The script is run, creating newFile. After that, newFile is downloaded to the user's local computer and renamed to localNewFile. It is also exported to the UNICORE Home storage.
{
  ApplicationName: "Bash shell",
  Environment: [
    "SOURCE=remoteScript.sh",
  ],
  Imports: [
    { From: "u6://EMI-UNICOREX/Home/storageScript.sh", To: "remoteScript.sh" }
  ],
  Exports: [
    { From: "newFile", To: "localNewFile" },
    { From: "newFile", To: "u6://EMI-UNICOREX/Home/storageNewFile" }
  ],
  Resources: {
    CPUs: 1,
  }
}
The bash.u job can then be run and the files in the storage listed. The localNewFile file should have been created on the user's local computer.
$ ucc run bash.u -v
$ ucc ls u6://EMI-UNICOREX/Home
ARC
If your job needs input data or produces output data, you do not need to copy these files by hand, as ARC will take care of all data movement. You only have to specify inputfiles and outputfiles in the job description (see the XRSL manual for details).
In addition, ARC provides command line tools for basic work with any storage elements: list, copy and remove files. Most common protocols are supported by ARC: gsiftp, http, ftp, as well as meta-protocols like srm, lfc, rls (see XRSL manual for details).
To list files in an SRM storage (e.g. dCache), do:
$ arcls srm://srm.myplace.org
To copy files use:
$ arccp http://www.mystuff.org/file1 gsiftp://se.myplace.org/file1
To remove files, use:
$ arcrm gsiftp://se.myplace.org/file1
Any combination of supported protocols can be used; authorisation on Grid storages is performed on the basis of your Grid proxy.
An interesting functionality of arcls or arccp is that they can be used even to check files created by your Grid jobs: you can use arcls to list the current working directory of the job or you can use arccp to copy a temporary result file from the execution site to your local machine, even while the job is running:
$ arcls JOBID
$ arccp JOBID/filename localname
This is however not recommended; in particular, avoid using arcrm on your job unless you really know what you are doing.
gLite
Create a local file, and then store it on an available SE:
$ echo "This a sample file" > example.txt
$ cat example.txt
This a sample file
$ lcg-cr -d lxbra1910.cern.ch file:$PWD/example.txt
GSIFTP: default set up URL mode
GSIFTP: dest: set up FTP mode. DCAU disabled. Streams = 1, Tcp BS = 0
guid:e2edabff-3fa7-4853-b44a-9cab256befdb
The file has been stored on the SE lxbra1910.cern.ch and the lcg-cr command has returned a Grid Unique Identifier (guid) for our file. Our file has also been automatically registered in the File Catalog and assigned a Logical File Name (lfn). With the -l option we could have specified an lfn for the file. The File Catalog provides an easier way to identify and browse our files using these Logical File Names. To see our file we can use the File Catalog command lfc-ls to list files; in this case we limit it to files created today.
$ lfc-ls /grid/$MYVO/generated/2011-07-11
file-99018d3a-138c-4344-82a0-48a2ad10c27b
Note that the identifier returned here is the lfn, not the guid. If we want to see the guid we can use the lcg-lg command
$ lcg-lg lfn:/grid/$MYVO/generated/2011-07-08/file-99018d3a-138c-4344-82a0-48a2ad10c27b
guid:e2edabff-3fa7-4853-b44a-9cab256befdb
To copy the file from the SE to the UI we can use the lcg-cp command as follows
$ lcg-cp guid:e2edabff-3fa7-4853-b44a-9cab256befdb file.txt
or we could use the lfn
$ lcg-cp lfn:/grid/$MYVO/generated/2011-07-08/file-99018d3a-138c-4344-82a0-48a2ad10c27b file.txt
Of course, the lfn is more useful if you set it to a sensible value with the -l option when creating the file. Try creating another file with a logical file name containing your user id, for example lfn:/grid/$MYVO/generated/2011-07-11/USERNAMEXX.txt. We can now delete the registered file using the guid; if we check for the file after deletion, it is no longer listed.
$ lcg-del -a guid:e2edabff-3fa7-4853-b44a-9cab256befdb
$ lfc-ls /grid/$MYVO/generated/2011-07-11
dCache
dCache is another EMI product which manages access to disk and tape storage. We will look briefly at some of the file access methods supported by dCache.
SRM
browsing files:
$ srmls -2 srm://sligo.desy.de:8443/pnfs/desy.de/data/testers.eu-emi.eu/
writing file to SE:
$ srmcp -2 file://////etc/group srm://sligo.desy.de:8443/pnfs/desy.de/data/testers.eu-emi.eu/group_DDMMYY_[A-Za-z]
$ srmls -2 srm://sligo.desy.de:8443/pnfs/desy.de/data/testers.eu-emi.eu/
writing file back from SE:
$ srmcp -2 srm://sligo.desy.de:8443/pnfs/desy.de/data/testers.eu-emi.eu/group_DDMMYY_[A-Za-z] file://///tmp/groups_080711A.back
deleting a file:
$ srmrm -2 srm://sligo.desy.de:8443/pnfs/desy.de/data/testers.eu-emi.eu/group_DDMMYY_[A-Za-z]
dCap
writing file to SE
$ dccp /etc/group dcap://xen-ep-emi-tb-se-3.desy.de:22125/pnfs/desy.de/data/testers.eu-emi.eu/group_DDMMYY_[A-Za-z]
$ srmls -2 srm://xen-ep-emi-tb-se-3.desy.de:8443/pnfs/desy.de/data/testers.eu-emi.eu/
writing file back from SE
$ dccp dcap://xen-ep-emi-tb-se-3.desy.de:22125/pnfs/desy.de/data/testers.eu-emi.eu/group_DDMMYY_[A-Za-z] /tmp/group_DDMMYY_[A-Za-z].back
webDAV
Browse files from command line:
$ cadaver http://sligo.desy.de:2880
> ls
> bye
GUI clients: nautilus, the Firefox add-on TrailMix (now proprietary), and OS-based file browsers that support webDAV.
More information by Oleg and Tanja http://trac.dcache.org/projects/dcache/wiki/WebDAV%20Hands%20on
Write files:
$ curl -v -T /etc/group http://sligo.desy.de:2880/pnfs/desy.de/data/testers.eu-emi.eu/testFileCURL_DDMMYY_[A-Za-z]
Look for the file through srmls or cadaver.
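The group_DDMMYY_[A-Za-z] and testFileCURL_DDMMYY_[A-Za-z] placeholders in the dCache commands appear to stand for the current date followed by a letter of your choice, so that each participant writes a distinct file. Under that reading, which is an assumption, a name can be built as sketched below; unique_name is a hypothetical helper name.

```shell
# Sketch: build a file name of the form <prefix>_DDMMYY_<letter>.
# unique_name is a hypothetical helper; reading DDMMYY as day, month and
# two-digit year is an assumption about the placeholder used in this tutorial.
unique_name() {
    printf '%s_%s_%s\n' "$1" "$(date +%d%m%y)" "$2"
}
```

The result of unique_name testFileCURL A can then be appended to the target URL of the curl or srmcp commands in place of the placeholder.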