Formulae
This tool is designed to let you combine multiple columns using simple or complex formulae. See Usage for practical information and Syntax for a description of the formula syntax.
Usage
The window is divided into two parts:
Available columns: on the left is a table of the metadata columns, where you can define aliases for the column titles,
Mappings: on the right, you can create new columns by combining existing ones.
- Creating a new formula then takes two steps:
To be used in a formula, a column needs an alias, which must be a valid Python identifier. Double-click a cell of the Alias column to define an alias for the corresponding column, then validate by pressing Enter.
The last step is to add a new column by clicking the Add new formula button, setting its name (a column with this name will be overwritten if it already exists), and defining a formula that includes constants and/or the aliases set in the previous step.
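The two steps can be sketched in plain Python. This is only an illustration of the alias-then-formula idea, not MetGem's implementation: the column titles, the `apply_formula` helper, and the row-by-row evaluation with `eval` are all assumptions made for the example.

```python
import math

# Hypothetical metadata table: column title -> one value per row.
metadata = {
    "Peak area (sample 1)": [10.0, 4.0, 0.0],
    "Peak area (sample 2)": [30.0, 6.0, 8.0],
}

# Step 1: give each column an alias that is a valid Python identifier.
aliases = {"a": "Peak area (sample 1)", "b": "Peak area (sample 2)"}

# Step 2: name the new column and write a formula using the aliases.
new_column_name = "Mean area"
formula = "(a + b) / 2"


def apply_formula(formula, aliases, table):
    """Evaluate `formula` once per row, binding each alias to that row's value."""
    code = compile(formula, "<formula>", "eval")
    n_rows = len(next(iter(table.values())))
    results = []
    for i in range(n_rows):
        names = {alias: table[title][i] for alias, title in aliases.items()}
        names.update(pi=math.pi, e=math.e)  # constants named in the Syntax section
        # eval is used for illustration only; builtins are disabled.
        results.append(eval(code, {"__builtins__": {}}, names))
    return results


# An existing column with the same name would be overwritten here.
metadata[new_column_name] = apply_formula(formula, aliases, metadata)
print(metadata[new_column_name])
```

The original column titles contain spaces and parentheses, which is why they cannot appear in the formula directly and an alias is needed.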
Toolbar
The toolbar at the top right of the dialog includes a few buttons:
Add new formula: Add a new empty line in the Mappings table. It’s up to you to fill in the Name and Formula cells,
Remove selected formulae: Remove the selected rows on the Mappings table.
Add Function: A drop-down list of available functions, organized in sections. A function is used by adding a comma-separated list of arguments in parentheses after its name, e.g. mean(a, b).
Add Constant: A list of available constants. A constant is just a replacement for
Syntax
The syntax used to describe formulae is a subset of the Python programming language. As MetGem uses the Pandas library internally, the following operations are supported:
Arithmetic operations except for the left shift (<<) and right shift (>>) operators, e.g., x + 2 * pi / y ** 4 % 42 - pi
Comparison operations, including chained comparisons, e.g., 2 < x < y
Boolean operations, e.g., x < y and x < y or not column1
List and tuple literals, e.g., [1, 2] or (1, 2)
Math functions: sum, mean, median, prod, std, var, quantile, min, max, sin, cos, exp, log, expm1, log1p, sqrt, sinh, cosh, tanh, arcsin, arccos, arctan, arccosh, arcsinh, arctanh, abs, arctan2 and log10
Constants: pi, e
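Because the formula syntax follows Python, the usual operator precedence applies. The sketch below uses plain Python scalars as stand-ins for column aliases (in MetGem, aliases refer to whole columns); the values of `x` and `y` are arbitrary and chosen only for illustration.

```python
import math

pi = math.pi
x, y = 1.0, 2.0

# ** binds tightest, then *, / and % (left to right), then + and -.
value = x + 2 * pi / y ** 4 % 42 - pi
explicit = (x + (((2 * pi) / (y ** 4)) % 42)) - pi

# A chained comparison is shorthand for two comparisons joined by "and".
x, y = 3, 5
chained = 2 < x < y
expanded = (2 < x) and (x < y)

print(value == explicit, chained == expanded)
```

When in doubt about precedence, adding parentheses as in `explicit` keeps a formula unambiguous.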
This Python syntax is not allowed:
- Expressions:
Function calls other than math functions.
is/is not operations
if expressions
lambda expressions
list/set/dict comprehensions
Literal dict and set expressions
yield expressions
Generator expressions
Boolean expressions consisting of only scalar values
Attribute access, e.g., df.a
Subscript expressions, e.g., df[0]
- Statements
Neither simple nor compound statements are allowed. This includes things like for, while, and if.
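One way to picture these restrictions is a small checker built on Python's standard `ast` module. This is a hypothetical sketch, not the check MetGem or Pandas actually performs: the `find_problems` helper and its messages are invented for the example, and it does not cover the rule about Boolean expressions made only of scalar values.

```python
import ast

# AST node types corresponding to the disallowed expressions listed above.
DISALLOWED = (
    ast.Attribute, ast.Subscript, ast.Lambda, ast.IfExp,
    ast.ListComp, ast.SetComp, ast.DictComp, ast.GeneratorExp,
    ast.Dict, ast.Set, ast.Yield, ast.YieldFrom, ast.Is, ast.IsNot,
)

# Function names taken from the list of supported math functions.
MATH_FUNCTIONS = {
    "sum", "mean", "median", "prod", "std", "var", "quantile", "min", "max",
    "sin", "cos", "exp", "log", "expm1", "log1p", "sqrt", "sinh", "cosh",
    "tanh", "arcsin", "arccos", "arctan", "arccosh", "arcsinh", "arctanh",
    "abs", "arctan2", "log10",
}


def find_problems(formula):
    """Return the names of disallowed constructs found in `formula`."""
    try:
        # mode="eval" accepts a single expression and rejects statements.
        tree = ast.parse(formula, mode="eval")
    except SyntaxError:
        return ["not a single expression"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, DISALLOWED):
            problems.append(type(node).__name__)
        elif isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in MATH_FUNCTIONS):
                problems.append("Call")
    return problems


print(find_problems("mean(a, b) + 2 * pi"))
print(find_problems("df.a"))
print(find_problems("for i in a: pass"))
```

An empty list means none of the constructs in `DISALLOWED` were found and every function call uses a listed math function.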
Note
A Python identifier is a name used to identify a variable, function, class, module or other object. An identifier starts with a letter A to Z or a to z or an underscore (_) followed by zero or more letters, underscores and digits (0 to 9).
Python does not allow punctuation characters such as @, $, and % within identifiers. Python is a case-sensitive programming language, so Manpower and manpower are two different identifiers.
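A candidate alias can be checked against these rules with the standard library. The `is_valid_alias` helper below is hypothetical, and rejecting reserved keywords is an assumption made here (a keyword such as `class` cannot be used as a variable name in an expression). Note that `str.isidentifier` also accepts non-ASCII letters, which Python 3 permits.

```python
import keyword


def is_valid_alias(name):
    """Return True if `name` is a Python identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["peak_area", "_tmp1", "2nd_peak", "price$", "my-col", "class"]:
    print(candidate, is_valid_alias(candidate))
```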