Formulae¶

This tools is designed to allow you to combine multiple columns using simple or complex formulae. See Usage for pratical informations and Syntax to get a description of the formulae’s syntax.
Usage¶
The window is divided in two parts:
Available columns: on the left lies a table of the metadata columns used to create aliases for metadata columns titles,
Mappings: on the right, you can create new columns by combining existing ones.
- Creating a new formula will then need two steps:
To be used in a formula, a column need an alias (that need to be a valid Python identifier). You just need to double-click inside a cell of the Alias column to define an alias for the corresponding column. Validate by hitting Enter.
Last step is to add a new column by clicking on the
, set it’s name (a column with this name will be overwritten if it already exists) and define a formula thats includes constants and/or alias set in step before.
Toolbar¶
The toolbar located on top right of the dialog includes a few buttons:
Add new formula: Add a new empty line in the Mappings table. It’s up to you to fill Name and Formula cells,
Remove selected formulae: Remove the selected rows on the Mappings table.
Add Function: A drop-down sectional list of available functions. A function is used by adding a comma-separated list of arguments in parentheses after it’s name, e.g.
mean(a,b)
.
Add Constant: A list of available constants. A constant is just a replacement for
Syntax¶
The syntax used to describe formulae is a subset of Python programming language. As MetGem use the Pandas library internally, the following operations are supported:
Arithmetic operations except for the left shift (
<<
) and right shift (>>
) operators, e.g.,x + 2 * pi / y ** 4 % 42 - pi
Comparison operations, including chained comparisons, e.g.,
2 < x < y
Boolean operations, e.g.,
x < y
andx < y
ornot column1
List and tuple literals, e.g.,
[1, 2]
or(1, 2)
Math functions:
sum
,mean
,median
,prod
,std
,var
,quantile
,min
,max
,sin
,cos
,exp
,log
,expm1
,log1p
,sqrt
,sinh
,cosh
,tanh
,arcsin
,arccos
,arctan
,arccosh
,arcsinh
,arctanh
,abs
,arctan2
andlog10
Constants:
pi
,e
This Python syntax is not allowed:
- Expressions:
Function calls other than math functions.
is/is not operations
if expressions
lambda expressions
list/set/dict comprehensions
Literal dict and set expressions
yield expressions
Generator expressions
Boolean expressions consisting of only scalar values
Attribute access, e.g.,
df.a
Subscript expressions, e.g.,
df[0]
- Statements
Neither simple nor compound statements are allowed. This includes things like for, while, and if.
Note
A Python identifier is a name used to identify a variable, function, class, module or other object. An identifier starts with a letter A to Z or a to z or an underscore (_) followed by zero or more letters, underscores and digits (0 to 9).
Python does not allow punctuation characters such as @, $, and % within identifiers. Python is a case sensitive programming language. Thus, Manpower and manpower are two different identifiers in Python.