Tips & Tricks: Curating Creatively¶
Because fw-heudiconv is built in Python, you have access to anything Python can do when you build your heuristic (as long as you use the special functions and data structures). Here, we show a few fun ways we’ve used Python to solve some tricky heuristic challenges.
Dynamically Replacing Subject/Session Labels¶
It might be useful to dynamically replace a Flywheel subject’s label with some other label in BIDS: for example, when you need to withhold personally identifying information from a BIDS output you share, but still want to keep the original Flywheel subject’s label for consistency. This can be accomplished in the Replace*() functions using a pandas DataFrame. If you’re running fw-heudiconv from disk, you can read in a file from within the heuristic:
def ReplaceSubject(label):
    import pandas as pd
    df = pd.read_csv('DeIdentifiedNames.csv')
And then filter your DataFrame as necessary:
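def ReplaceSubject(label):
    import pandas as pd
    # lookup table with one row per subject
    df = pd.read_csv('DeIdentifiedNames.csv')
    # the filter is hard-coded here for illustration; in practice, match on `label`
    target = df[(df.first_name == "Jason")]
    replacement = target['new_ID'].values[0]
    return str(replacement)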
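The example above filters on a fixed name. A minimal sketch of a lookup keyed on the `label` argument might look like the following. The column names `flywheel_label` and `new_ID`, and the in-memory CSV standing in for a file on disk, are assumptions made for illustration; they are not part of fw-heudiconv.

```python
import io
import pandas as pd

# Hypothetical lookup table; in practice this would be a CSV on disk.
# The column names `flywheel_label` and `new_ID` are assumptions.
LOOKUP_CSV = io.StringIO(
    "flywheel_label,new_ID\n"
    "sub-jason,001\n"
    "sub-molly,002\n"
)
# dtype=str keeps leading zeros such as "001" intact
LOOKUP = pd.read_csv(LOOKUP_CSV, dtype=str)


def ReplaceSubject(label):
    # Select the replacement IDs of rows whose Flywheel label matches
    match = LOOKUP.loc[LOOKUP["flywheel_label"] == label, "new_ID"]
    if match.empty:
        # No entry for this subject: leave the label unchanged
        return label
    return str(match.iloc[0])
```

Returning the original label when no match is found keeps curation running, but it also means an unlisted subject keeps an identifying label. If that matters for your project, raising an error in that branch may be the safer choice.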
DataFrames to Strings¶
To use the AttachTo*() functions, your data needs to be converted into a string. To attach a DataFrame object, use the following steps:
def AttachToSession():
    # example: uploading multiple files -- a json, and a TSV
    import json
    adict = {
        "id": "04",
        "name": "foo",
        "scan": "blah"
    }
    json_object = json.dumps(adict, indent=4)
    attachment1 = {
        'name': 'jsonexample.json',
        'data': json_object,
        'type': 'application/json'
    }
    import pandas as pd
    raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
                'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
                'age': [42, 52, 36, 24, 73],
                'preTestScore': [4, 24, 31, 2, 3],
                'postTestScore': [25, 94, 57, 62, 70]}
    df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'age', 'preTestScore', 'postTestScore'])
    attachment2 = {
        'name': '{subject}/{session}/perf/{subject}_{session}_aslcontext.tsv',
        'data': df.to_csv(index=False, sep='\t'),  # .to_csv() with no file argument returns a string!
        'type': 'text/tab-separated-values'
    }
    # this is also an opportunity to demonstrate how to attach multiple files -- just use a list!
    return [attachment1, attachment2]
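Since a forgotten string conversion is an easy mistake to make, a quick check before returning can catch it early. The sketch below assumes, as in the example above, that each attachment is a dict with `name`, `data`, and `type` keys. `check_attachments` is a hypothetical helper, not part of fw-heudiconv.

```python
import pandas as pd


def check_attachments(attachments):
    """Hypothetical helper: verify each attachment before returning it."""
    # Accept a single dict or a list of dicts, as in the examples above
    if isinstance(attachments, dict):
        attachments = [attachments]
    for att in attachments:
        missing = {"name", "data", "type"} - set(att)
        if missing:
            raise ValueError(f"attachment is missing keys: {sorted(missing)}")
        if not isinstance(att["data"], str):
            raise TypeError(f"{att['name']}: 'data' must be a string")
    return attachments


df = pd.DataFrame({"volume_type": ["control", "label"]})
# Converted to a string with .to_csv(): passes the check
good = {"name": "example.tsv", "data": df.to_csv(index=False, sep="\t"),
        "type": "text/tab-separated-values"}
# Raw DataFrame left in 'data': check_attachments raises TypeError
bad = {"name": "example.tsv", "data": df, "type": "text/tab-separated-values"}
```

Ending an AttachTo*() function with `return check_attachments([attachment1, attachment2])` would then fail loudly if a DataFrame slipped through unconverted.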
Arterial Spin Labelling Data¶
ASL is a BIDS protocol proposal that is fast on its way to being accepted into the official BIDS spec, but is still being reviewed and updated. At present, ASL in BIDS requires a special kind of events file, the aslcontext file. This is a TSV file, not unlike the events.tsv file given for BOLD task data, but it is used in this case to denote the order of label vs. control in the volumes. The file might look like this:
volume_type
control
label
control
label
For this purpose, we can use the AttachToSession() function. You could do as above and read in a file on disk within the function, but you could be even cleverer and instead create this file dynamically:
def AttachToSession():
    # number of control/label pairs; the file gets two rows per pair
    NUM_PAIRS = 10
    data = ['control', 'label'] * NUM_PAIRS
    data = '\n'.join(data)
    data = 'volume_type\n' + data  # the data is now a string; perfect!
    output_file = {
        'name': '{subject}_{session}_aslcontext.tsv',
        'data': data,
        'type': 'text/tab-separated-values'
    }
    return output_file
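The function above repeats the control/label pair a fixed number of times, so the file has two rows per pair. If your acquisitions differ in pair count or in which volume comes first, the string building could be factored into a small helper, sketched below. `make_aslcontext` and its `first` parameter are hypothetical names used for illustration; they are not part of fw-heudiconv.

```python
def make_aslcontext(num_pairs, first="control"):
    """Hypothetical helper: build the aslcontext TSV text for num_pairs pairs."""
    if first not in ("control", "label"):
        raise ValueError("first must be 'control' or 'label'")
    second = "label" if first == "control" else "control"
    rows = [first, second] * num_pairs
    # Header line, one row per volume, and a trailing newline
    return "\n".join(["volume_type"] + rows) + "\n"


def AttachToSession():
    return {
        'name': '{subject}_{session}_aslcontext.tsv',
        'data': make_aslcontext(num_pairs=10),
        'type': 'text/tab-separated-values'
    }
```

Keeping the pair count as an argument makes it easy to check the generated row count against the number of volumes reported for the scan.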
This could be especially useful if you don’t want to rely on external data files to curate your project.
You can find the correct number of label-control pairs from the DICOM header info in the output of fw-heudiconv-tabulate, which will also help you hard-code the extra ASL metadata and insert it into the MetadataExtras variable.