{ "cells": [ { "cell_type": "markdown", "id": "eada1107-1957-4868-b9a2-326ce0663b43", "metadata": {}, "source": [ "# Accessing files in different storage locations" ] }, { "cell_type": "markdown", "id": "f27d9e7e-5f1b-4630-843a-3025741b690c", "metadata": {}, "source": [ "In this demo, we demonstrate how data can be accessed from different storage spaces available to a Purdue Analysis Facility user. Copies of the same ROOT file have been placed in different storage locations, in this example they are accessed via `uproot.open()`." ] }, { "cell_type": "code", "execution_count": 1, "id": "76288510-3239-4606-89b2-47156171644c", "metadata": { "tags": [] }, "outputs": [], "source": [ "import uproot\n", "import time\n", "\n", "def test_file_access(path, filename):\n", " '''\n", " This function opens a NanoAOD file at a given path\n", " and prints the pT of the first muon found in that file,\n", " as well as the time taken to access the file.\n", " '''\n", "\n", " print(f\"Accessing file at {path}{filename}\")\n", " start = time.time()\n", " with uproot.open(path+filename)[\"Events\"] as file:\n", " pt = file[\"Muon_pt\"].array()[0][0]\n", " print(\"First muon pT: \", round(pt, 2), \"GeV\")\n", "\n", " dt = round(time.time()-start, 2)\n", " print(f\"Time elapsed: {dt}s.\")\n", " print()\n" ] }, { "cell_type": "markdown", "id": "865c3159-8061-4123-b80f-c5c3945f300f", "metadata": { "tags": [] }, "source": [ "### 0. Local files" ] }, { "cell_type": "code", "execution_count": 2, "id": "c187c93a-8479-4f77-a1ea-5c5306b79108", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accessing file at ./data/test_nanoaod.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 3.91s.\n", "\n" ] } ], "source": [ "test_file_access(\"./data/\", \"test_nanoaod.root\")" ] }, { "cell_type": "markdown", "id": "050e19bc-955e-4a75-a4e4-c92e81a0f09c", "metadata": { "tags": [] }, "source": [ "### 1. Purdue Depot\n", "The Purdue Depot storage is writeable for Purdue users, but read-only for external users." ] }, { "cell_type": "code", "execution_count": 3, "id": "a1d9210a-18fd-4312-b7b1-38d6f6d1fe06", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accessing file at /depot/cms/hmm/test_nanoaod_depot.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 0.87s.\n", "\n", "Accessing file at ~/depot/hmm/test_nanoaod_depot.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 0.71s.\n", "\n" ] } ], "source": [ "# Choice of two paths - the mount point and the symlink in home directory\n", "# (the latter is convenient for access via the file browser).\n", "\n", "filename = \"test_nanoaod_depot.root\"\n", "test_file_access(\"/depot/cms/hmm/\", filename)\n", "test_file_access(\"~/depot/hmm/\", filename)" ] }, { "cell_type": "markdown", "id": "7b3e85b1-65b3-490e-93cb-5e7d8be5cb12", "metadata": { "tags": [] }, "source": [ "### 2. Purdue EOS\n", "Purdue EOS storage is mounted as read-only FS; however, Purdue users can write to their EOS directories using `gfal` commands [(see instrictions here)](https://www.physics.purdue.edu/Tier2/user-info/tutorials/dfs_commands.php)." ] }, { "cell_type": "code", "execution_count": 4, "id": "ac050d04-e2ad-431f-a216-1ff943fa92a0", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accessing file at /eos/purdue/store/user/dkondrat/test_nanoaod_eos_purdue.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 0.82s.\n", "\n", "Accessing file at ~/eos-purdue/store/user/dkondrat/test_nanoaod_eos_purdue.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 0.79s.\n", "\n" ] } ], "source": [ "# Choice of two paths - the mount point and the symlink in home directory\n", "# (the latter is convenient for access via the file browser).\n", "\n", "filename = \"test_nanoaod_eos_purdue.root\"\n", "test_file_access(\"/eos/purdue/store/user/dkondrat/\", filename)\n", "test_file_access(\"~/eos-purdue/store/user/dkondrat/\", filename)" ] }, { "cell_type": "markdown", "id": "6d7241ef-7e41-45c0-8b21-c4c53f70a8e4", "metadata": { "tags": [] }, "source": [ "### 3. CERN EOS (CERNBox)\n", "\n", "
\n", "Warning\n", "\n", "The following cell must be modified, since each user only gets access to their own CERNBox directory.\n", "To enable access to your CERNBox, run command `eos-connect` in terminal.\n", "
" ] }, { "cell_type": "code", "execution_count": 5, "id": "66e36f07-1881-4201-b767-93c73d41760a", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accessing file at /eos/cern/home-d/dkondrat/test_nanoaod_eos_cern.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 5.31s.\n", "\n", "Accessing file at ~/eos-cern/test_nanoaod_eos_cern.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 0.77s.\n", "\n" ] } ], "source": [ "# Choice of two paths - the mount point and the symlink in home directory\n", "# (the latter is convenient for access via the file browser).\n", "\n", "filename = \"test_nanoaod_eos_cern.root\"\n", "test_file_access(\"/eos/cern/home-d/dkondrat/\", filename)\n", "test_file_access(\"~/eos-cern/\", filename)" ] }, { "cell_type": "markdown", "id": "1ec8737d-d9c4-4179-9328-781a789a4c7a", "metadata": { "tags": [] }, "source": [ "### 4. XRootD\n", "Before using XRootD, initialize the VOMS proxy in terminal (`voms-proxy-init --voms cms`)" ] }, { "cell_type": "code", "execution_count": 6, "id": "5eca0546-a33b-44e9-9e62-315b3c5d4d7e", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accessing file at root://eos.cms.rcac.purdue.edu//store/user/dkondrat/test_nanoaod_eos_purdue.root\n", "First muon pT: 82.55 GeV\n", "Time elapsed: 2.03s.\n", "\n" ] } ], "source": [ "test_file_access(\"root://eos.cms.rcac.purdue.edu//store/user/dkondrat/\", \"test_nanoaod_eos_purdue.root\")" ] }, { "cell_type": "markdown", "id": "b5c7c550-906b-479f-baa6-c7bd1312293d", "metadata": { "tags": [] }, "source": [ "### 5. CVMFS" ] }, { "cell_type": "code", "execution_count": 7, "id": "86e1c2f2-cd73-46aa-8b46-1fe1a643cab1", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/cvmfs/cms.cern.ch/cmsset_default.sh\n" ] } ], "source": [ "!ls /cvmfs/cms.cern.ch/cmsset_default.sh" ] }, { "cell_type": "code", "execution_count": null, "id": "2a98d156-382f-494a-a07c-ee3daf3706ff", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python3 kernel (default)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.10" } }, "nbformat": 4, "nbformat_minor": 5 }