0% found this document useful (0 votes)
156 views52 pages

From Excel To KNIME 081921-44

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
156 views52 pages

From Excel To KNIME 081921-44

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 52

2

Copyright © 2021 by KNIME Press

All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or
transmission in any form or by any means, electronic, mechanical, photocopying, recording or likewise.

This book has been updated for KNIME 4.4.

For information regarding permissions and sales, write to:

KNIME Press
Hardturmstrasse 66
8005 Zurich
Switzerland

[email protected]

3
Table of Contents

General Usage.................................................................................................................................................................................................................................... 6
Spreadsheets | Workflows and Nodes ........................................................................................................................................................................................................ 7
Folders | Workspace .................................................................................................................................................................................................................................... 8
The KNIME Workbench ....................................................................................................................................................................................................................................... 9
Building a KNIME Workflow .............................................................................................................................................................................................................................. 10
Display Data Table ............................................................................................................................................................................................................................................ 11
Input/Output .................................................................................................................................................................................................................................... 12

Opening an Excel File | Excel Reader Node ............................................................................................................................................................................................. 13

Opening a CSV or txt File | CSV Reader Node.......................................................................................................................................................................................... 14


Importing Content from Multiple Files of the same Type to a Single Table ..................................................................................................................................................... 15
Importing Content from Multiple Sheets into a Single Table............................................................................................................................................................................ 16

Saving an Excel File | Excel Writer Node.................................................................................................................................................................................................. 17

Adding a Sheet to an Excel File | Excel Writer Node ................................................................................................................................................................................ 18

Data Types in Excel | Data Types in KNIME ............................................................................................................................................................................................. 19

Connect to a Database | Database Connector Nodes ............................................................................................................................................................................. 20


Tips on Reading Data with KNIME Analytics Platform ..................................................................................................................................................................................... 21
Appending / Joining Data.................................................................................................................................................................................................................. 22

Appending Data | Concatenate Node ...................................................................................................................................................................................................... 23


VLOOKUP | Filter and Joiner Node ............................................................................................................................................................................................................ 24
Filtering, Transformations, and Aggregations .................................................................................................................................................................................... 25
Filtering / Removing Rows with a Specific Value Row Filter ............................................................................................................................................................................ 26

4
Filtering / Removing Rows with Different Values Rule-based Row Filter ......................................................................................................................................................... 27

Removing Duplicates | Duplicate Row Filter ............................................................................................................................................................................................ 28

Sorting Rows by Multiple Key Columns | Sorter Node............................................................................................................................................................................. 29

Removing Columns | Column Filter Node................................................................................................................................................................................................ 30

Reordering and Renaming Columns | Column Resorter and Column Rename Node ............................................................................................................................. 31

Changing Data Types | String to Number and Number to String ............................................................................................................................................................. 32


Tip on Data Manipulation with KNIME Analytics Platform ............................................................................................................................................................................... 33
Data Aggregation .............................................................................................................................................................................................................................................. 34

Pivot Tables | Pivoting Node ................................................................................................................................................................................................................... 35

Pivot Table without Columns | GroupBy Node ........................................................................................................................................................................................ 36


Unpivot | Unpivoting Node ......................................................................................................................................................................................................................... 37
Math Functions and .......................................................................................................................................................................................................................................... 38
Text Functions .................................................................................................................................................................................................................................................. 38

Math Functions | Math Formula Node ..................................................................................................................................................................................................... 38

Math Functions | Math Formula Node ..................................................................................................................................................................................................... 40

Concatenation and Find& Replace | String Manipulation Node............................................................................................................................................................... 41


Formatting Excel Tables ................................................................................................................................................................................................................... 42
Creating an XLS Control Table with Tag Values ............................................................................................................................................................................................... 45
Adding Formatting Actions based on Tag Values ............................................................................................................................................................................................ 49

5
General Usage

6
Spreadsheets Workflows and Nodes

Excel KNIME Analytics Platform

Microsoft Excel is a spreadsheet program, which features calculation, graphing KNIME Analytics Platform implements visual programming. This means that each
tools, pivot tables, and a macro programming language (Visual Basic for data analysis step is represented by means of an icon block, called a node, in a
Applications, VBA for short). graphical editor. Each node can perform one specific task. For example the Excel
Reader node can read one sheet of an Excel file or the Row Filter node allows to
By using cell mathematics, macros, and VBAs you can edit a sheet. This can be filter rows based on a filter criterion.
really easy cell mathematics, like summarizing the values from cells A1 and B1 (=
SUM(A1, B1)), but can be also really complex, embedded logic. A sequence of connected nodes is called a workflow and is the corresponding
concept of an Excel sheet with many functions and/or VBAs.

Data is organized through data tables, where each data cell is identified by a
column header and a Row ID. To visualize the content of a data table, see page 11.

Note. Nodes have four possible states displayed by a little traffic light under the
node itself:
- Not configured -> red light
- Configured -> yellow light
- Successfully executed -> green light
- Executed with error -> red light with cross

For more details about KNIME Analytics Platform, check:


- R. Silipo, “KNIME Beginner’s Luck”, KNIME Press, 2021
- R. Silipo, J. Prinz, “KNIME Advanced Luck”, KNIME Press, 2021

7
Folders Workspace

Excel KNIME Analytics Platform

Excel files are normally saved in different folders. A single Excel file can contain The workspace defines the folder where all workflows, data and intermediate
data are saved. One workflow corresponds to an Excel sheet with all formulas,
multiple sheets. visualizations and VBAs. All the projects and datasets saved in a workspace are
available in the KNIME Explorer, located in the top left corner of the KNIME
workbench. The path to the workspace is selected at the very beginning, after
starting KNIME Analytics Platform.

You can still change the workspace after


KNIME Analytics Platform has been
launched, by going to “File” in the top menu
and selecting “Switch Workspace”. You
can have multiple workspaces, e.g. for
different projects or customers.

8
The KNIME Workbench

After downloading and installing KNIME Analytics Platform you can start it from the desktop or from the installation folder. The KNIME workbench, which you can see
below, opens including the following panels:
“KNIME Explorer” showing the list of currently available workflows and KNIME servers for the selected workspace and the My-KNIME-Hub mountpoint.
“Workflow Coach” recommending the next node based on the KNIME user statistics and the node currently selected in the “Workflow Editor”.
“Node Repository” containing all currently installed nodes. A “Search” box is available at top of this panel to search for nodes.
“Workflow Editor” in the center allowing for the creation and editing of workflows.
“Node Description” showing a text describing the node task and configuration settings, for the selected node either in the “Workflow Editor” or in the “Node Repository”
panel.
“Node Monitor” showing a preview of the output table of the node selected in the “Workflow Editor”
“KNIME Hub” allowing use of the KNIME Hub to search for nodes, workflows, components, and extensions.
“Outline” offering an overview of the workflow
“Console” showing execution messages, e.g. error and warning messages.

Workflow Editor
KNIME
Explorer Node
Description

Workflow
Coach

KNIME Hub

Node
Repository Console / Node
Outline Monitor

9
Building a KNIME Workflow
KNIME workflows are created by dragging&dropping nodes from the “Node Repository” or “Workflow Coach” panel to the “Workflow Editor”. Use the search box on the
top of the Node Repository or browse through the nodes, sorted by different categories to find the correct node for your next step.
Nodes are connected to each other through their input and output ports. Just click the output port of the first node and release at the input port of the second node.
Nodes that have just been created show a red light status: not yet configured. To configure a node, right-click the node and select the option “Configure” or alternatively
double-click the node. The node “Configuration” window opens. Configure the node and close the configuration window. If the configuration is successful, the node status
changes to a yellow traffic light.
The node is now configured, but not yet executed. To execute the node, right-click the node and select the “Execute” option. If the execution is successful, the node
changes its status to a green light.

Step 1: Search and create a node via drag& drop Step 2: Connect the nodes

Step 3: Configure the node Step 4: Execute the node

Note 1: To create a new, empty workflow right click in the KNIME Explorer panel, select “Create New KNIME Workflow…” and define the name and destination of the new
workflow in the new window.
Note 2: Click the magnifier next to the search box in the node repository to change the mode of the search box to a fuzzy search. This makes finding the correct node
easier in the beginning.
Note 3: The “Getting Started Guide” guides you step by step through building your first example workflow.

10
Display Data Table

Excel KNIME Analytics Platform

In Excel what you see is what you get. This means that the data table you see is The output data tables produced after node
the final data table. execution are always available. To see them:
- Right-click the node in the workflow
- Select the last option in the context
menu

Column header Data Type

Note: Some nodes, like plotting and modeling nodes, also have a more complex
“View” function. The option leading to this “View” is usually displayed in the middle
of the context menu.

11
Input/Output

12
Opening an Excel File Excel Reader Node

Excel KNIME Analytics Platform


To open an Excel file you can either double click the file or open Excel, go to “File” The Excel Reader node reads a single sheet of an Excel file.
in the top menu, select “Open…” and then browse to the file you want to open. Similar to Excel you first have to define the file path. Next you can
select the sheet. The additional setting options enable you to
define whether the table has column headers and / or row IDs
and to specify which part of the sheet you want to read.

File path

Sheet
Name

Column headers

Note: To read multiple Excel files which have the same column headers and are
all stored in one folder, select in the Input location part as Mode “Files in folder”.
This reads all Excel files and concatenates them, aka stacks them on top of each
other.

13
Opening a CSV or txt File CSV Reader Node

Excel KNIME Analytics Platform


To open a CSV file click “File” in the top menu and select “Import”. Select CSV file The CSV Reader node reads various text based files, e.g. CSV
and then choose the file you want to open. files. In the configuration window you can set all the necessary
Fill in all necessary options, such as the separating character, to correctly read the options, such as separating character, to correctly read the file.
file.
File path

Delimiter

Column header
Headerssd

Note 1: Click the “Autodetect format” button if the node doesn’t create the preview.
Note 2: Check out the additional tabs to limit the number of rows or to change the
encoding.

14
Importing Content from Multiple Files of the same Type to a Single Table

Excel KNIME Analytics Platform


If you have a folder containing multiple files of the same type (e.g. CSV), you can Many reader nodes in KNIME Analytics Platform support reading multiple files in
open them all at the same time in different Excel instances. Follow the instruction a folder and combining them into one table. For example the CSV Reader can read
from the previous page and select all the files you want to open. multiple text-based files.
To read multiple files select the “Mode” “Files in folder” and specify the folder
To move data tables together into one single sheet, you have to proceed manually using the “Browse” button. A little note tells you how many files are currently
using copy and paste. selected. In case you don’t want to read all the files in a folder, click the button
“Filter options” to specify which files should be included according to their file
Note: Before copying and pasting, ensure that all files have the same column extension and/or name. Lastly you can also include files from subfolders by
order. activating the checkbox “Include subfolders”.

Mode

Filter

Note: In the Transformation tab you can define whether you want to use the union
or the intersection of the columns from the different tables.

Union or intersection

15
Importing Content from Multiple Sheets into a Single Table

Excel KNIME Analytics Platform


To move data from multiple Excel sheets into one Excel sheet you proceed With a simple loop you can read all sheets of an Excel file automatically.
manually, using copy and paste.
The Read Excel Sheet Names node creates a list of all sheet names. The loop (the
Note: Before copying and pasting, ensure that all files have the same column part in between the blue nodes) reads one sheet of the Excel file at each iteration.
order. Therefore, at each iteration the Table Row to Variable Loop Start node creates a
flow variable with the sheet name as its value. This flow variable is used in the
Excel Reader node to control the sheet selection. The Loop End concatenates the
content from the different tables.

Note 1: Lesson 3 of the free KNIME Self-Paced Course L2-DW KNIME Analytics
Platform for Data Wrangles course introduces flow variables.

Note 2: Lesson 4 of the free KNIME Self-Paced Course L2-DW KNIME Analytics
Platform for Data Wrangles introduces loops in KNIME.

16
Saving an Excel File Excel Writer Node

Excel KNIME Analytics Platform


To save the sheet of an Excel file you have different options: The Excel Writer node writes the input data table into a sheet in
• Click File->Save As… and define the output location an Excel file, in either xls or xlsx format.
• Or press Shift+Ctrl+s and define the output location In the configuration window you can set the output location and
sheet name. The additional setting options enable you to
overwrite an existing file and to define whether you want to write
the column headers / row ids into the first row / column of your Excel sheet.

Output location

Sheet name

Column headers
and row key

Note 1: To write multiple tables into different sheets you can add dynamic input
ports and define a sheet name for each input table.

17
Adding a Sheet to an Excel File Excel Writer Node

Excel KNIME Analytics Platform


To add a new sheet to an existing Excel file you have to click the plus below the The Excel Writer node can also add sheets to an existing Excel
table, next to the already existing sheets. File. In the configuration window you can set the location of the
existing Excel file and select “append” for the setting option “If
To add a table from another Excel file you can copy & paste the table manually. exists”. In the “Sheets” part you can define the new sheet name
and whether the node should overwrite or fail in case a sheet with
the defined sheet name exists already.

Location of the
existing File

Append option

Sheet settings

18
Data Types in Excel Data Types in KNIME

Excel KNIME Analytics Platform


The screenshot on the right shows you the KNIME Analytics Platform supports reading all data types from Excel, and even
different available datatypes in Excel. more. When reading a file, KNIME Analytics Platform tries to autodetect the
correct data type. If you want to change this during reading you can use the
Transformation tab of the Excel Reader node.

They can be mapped to the following data


types in KNIME Analytics Platform

Excel KNIME Analytics Platform


General Number or String
Number Number
Currency Number (Integer or double)
(e.g. 50,25 €) (e.g. 50,25)
Accounting Number (Integer or double)
(e.g. 50,25 €) (e.g. 50,25)
Short Date Local Date
Long Date Local Date
Time Local Time
Percentage Number (double)
(e.g. 50%) (e.g. 0,5) Note 1: The Transformation tab is also available in other reader nodes.
Fraction Number (double) Note 2: In the Transformation tab you can also rename, remove, and change the
Scientific Integer order of columns.
(e.g. 5,00E+02) (e.g. 500) Note 3: To change the data type later in the workflow you can use one of the
following nodes: String to Number, Number to String, or Table Manipulator.

19
Connect to a Database Database Connector Nodes

Excel KNIME Analytics Platform


You can connect to a database in Excel. A description about how to do this is Reading data from a database follows three steps: connect, select, and extract. The
available via the following link: workflow is built step by step with a Connector, a DB Table Selector, and a DB
http://www.erpsoftwareblog.com/2017/01/microsoft-excel-connections-sql- Reader node. Database nodes simply build the SQL query, they do not execute it.
databases/ Only the final node, e.g. the DB Reader node, executes the SQL query and extracts
the data.

A number of database
connector nodes are
available to connect to the
most commonly used
databases. However, the
DB Connector node allows
you to connect to all JDBC
compliant databases.

There are more database nodes to help build a SQL query for in database
processing. You can use them in between the DB Table Selector and the DB Reader
node.

20
Tips on Reading Data with KNIME Analytics Platform
All reader nodes require a path to the input file location. Let’s collect some Tips&Tricks Tip&Trick 3: Reading from another file system:
for this:
KNIME Analytics Platform allows you to connect and read from many
Tip&Trick 1: Use drag&drop from the KNIME Explorer: different sources / file system, e.g. Amazon S3, Microsoft SharePoint
Online, Databricks to name just a few. Three steps are necessary (the
Data files saved in the workspace folder are available in the KNIME Explorer panel (top file handling guide gives you further information).
left panel). To read in one of these files, you just drag&drop the file from the KNIME
Explorer panel to the workflow editor. KNIME automatically creates the correct reader c
node and sets the path of the input location. Step 1: Click “...” in the bottom left corner of the reader node icon to
add a File System Connection port
Tip&Trick 2: Different options to define a file path:

In KNIME we have different options to provide a file path. This becomes important when
you start sharing your workflows or exporting them to other KNIME Analytics Platform
installations or KNIME Servers. There are four default file systems available in KNIME
Analytics Platform.

• Local File System: Allows you to select a file/folder from your local system.
Step 2: Connect to the desired file system via the dedicated
• Mountpoint: You can connect to a KNIME Server or the KNIME Hub via additional connector node and connect it with the reader node
mountpoints in the KNIME Explorer. To read data from either LOCAL or another
mountpoint select “Mountpoint“. When selected, a new drop-down menu
appears so that you can choose the mountpoint. Unconnected mountpoints are
grayed out but can still be selected (note that browsing is disabled in this case).
Go to the KNIME Explorer and connect to the mountpoint to enable browsing.

• Relative to: Allows you to choose whether to resolve the path relative to the
current mountpoint, current workflow, or the current workflow's data area. When
Step 3: Select the file/folder in the connected file system
selected a new drop-down menu appears to choose which of the three options
to use.

• Custom/KNIME URL: Allows to specify a URL (e.g. file://, http:// or knime://


protocol). Browsing is disabled for this option.

21
Appending / Joining Data

22
Appending Data Concatenate Node

Excel KNIME Analytics Platform


To append data to a table select manually the area you want to append and copy
and paste the content below the first table.

Note 1: Before copying and pasting ensure that all tables have the same column
structure. The Concatenate node writes two or
more tables below each other.

Note 1: The order of the column


doesn’t have to be the same.
Note 2: You can decide whether you want the intersection or the union of the
columns.
Note 3: The Concatenate node has the option to add more input ports, to
concatenate more than two tables. To add an additional input port, click on the
three dots on the lower left of the node and select “Add input port”.

23
VLOOKUP Filter and Joiner Node

Excel KNIME Analytics Platform


The VLOOKUP function is used for different tasks. The most common uses are: 1. To look up a certain value, e.g. the price of a certain product

1. Look up a certain value, e.g. the price of a certain product.

2. Join columns based on a primary key (look up value), e.g. join product
information based on the product ID.

An alternative function for the second task is INDEX MATCH. Note 1: Your full original table is still available at the output port of the Table
Reader node. See more information about the Row Filter and Column Filter nodes
on pages 26 and 30.
2. Join columns based on a joining column, e.g. join product information based on
product ID.
Filter
Columns
Join column(s)
Columnsdsdffd

Join mode

Note 1: The “Joining Column” is the look up value column.

24
Filtering, Transformations,
and Aggregations

25
Filtering / Removing Rows with a Specific Value Row Filter

Excel KNIME Analytics Platform


To filter data select a random cell in the data In KNIME Analytics Platform there is no difference between
table, go to the Home tab, click the “Sort& filtering and removing rows, as the original table is not deleted
Filter” button and select “Filter”. and is still available at the output port of the previous node.

The Row Filter node filters the table based on a filter criteria,
e.g. by including / excluding all rows with a certain value in the filter column.

Filter column

Select the value you are interested in from


the drop down menu.
Value to in- or exclude

To remove rows, select the rows you want to delete, right click and select “Delete
Rows”.

Note 1: On the right you can choose whether you want to include or exclude the
rows with the matching value
Note 2: If you only interested in the rows with one specific value you can use the
Row Filter node.
Note 3: If you want to include rows based on different values you can use the
Rule-based Row Filter. (See next page).
Note 4: Further filter options are available, e.g. on a numerical range, filter rows by
row number or row ID, or missing values only.

26
Filtering / Removing Rows with Different Values Rule-based Row Filter

Excel KNIME Analytics Platform


To filter data select a random cell in the data tables, click on “Sort& Filter” and In KNIME Analytics Platform there is no difference between
select “Filter”. filtering and removing rows, as the original table is not deleted
and is still available at the output port of the previous node.

The Rule-based Row Filter node filters rows in or out according


to a set of rules. To include / exclude rows with two different values you can use
the following expression
$Column_Name$ = “Value 1” OR $Column_Name$ = “Value 2”=> TRUE

Select the values you are interested in from the drop down menu. Column List

List of functions

Expression

To remove rows you have to select the rows you want to delete, right click and
choose “Delete Rows”.

Note 1: At the bottom of the configuration window you can choose whether you
want to include or exclude TRUE matches.
Note 2: Columns are given by their name surrounded by $. Add them to the
expression frame by double clicking a column name in the Column List.
Note 3: The Rule-based Row Filter node has a number of different functions for
many advanced filter options.
Note 4: Different rows in the expression frame work like an OR conjunction.

27
Removing Duplicates Duplicate Row Filter

Excel KNIME Analytics Platform


To remove duplicates The Duplicate Row Filter node detects and treats duplicates. The
• Select the range of cells that might have duplicates that you want to default treatment removes duplicate rows like in Excel. The
remove. columns in the “Include” frame correspond to the selected
• Go to the Data tab and select “Remove Duplicates” columns for duplicate detection in Excel. This means rows that
have the same values in these columns are detected as
duplicates.

• Select the columns for duplicate detection and click “OK”

Note 1: In the “Advanced” tab you can change the treatment for duplicates, for
example to keep duplicate rows and to add a column showing which of the rows
are unique, chosen, or duplicates.

28
Sorting Rows by Multiple Key Columns Sorter Node

Excel KNIME Analytics Platform


To sort rows by multiple columns, select the columns by which you want to sort. With the Sorter node you can sort by one or more columns in
Then click on Sort&Filter and select “Custom Sort”. either ascending or descending order.

Note 1: You can add as many key-columns as you want by clicking the “Add Rule”
button.
Note 2: You can temporarily sort the output table of a node. Click on the column
header based on which you want to sort and select whether you want to sort
ascending or descending.

29
Removing Columns Column Filter Node

Excel KNIME Analytics Platform


To remove columns just select and delete the superfluous columns. With the Column Filter node you can delete superfluous
columns. Remember that the full data table will be still
Remember: If you remove a column used in a calculation field this will break your available at the output port of the previous node and for
formula. calculations you run upstream.

Note 1: You can use the arrow buttons in the middle to move columns from the
Include to the Exclude frame and vice versa.
Note 2: You can use the Wildcard/Regex Selection to automatically remove
columns by a name patter.
Note 3: You can use the Type Selection to automatically remove columns by data
type.

30
Reordering and Renaming Columns Column Resorter and Column Rename Node

Excel KNIME Analytics Platform


To reorder columns: To resort columns you can use the Column Resorter node. Define the desired
• Select the column header of the column you want to move order of the columns by selecting the one you want to move and using the
• Press ctrl + shift + down arrow “Actions” buttons on the right.
• Click and hold the green outline of the column you want to move
• Drag your column to the desired position

To rename a column just click on the column cell and change the cell value.
To rename columns you can use the Column Rename node. Double click the
column you want to rename, activate the checkbox “Change” and define the
column header in the textbox.

31
Changing Data Types String to Number and Number to String

Excel KNIME Analytics Platform


To change the data type of a column or a cell, select it and go to the home tab. To change the data type of a column you can use either the String to Number or
Then use the drop down menu in the middle. the Number to String nodes.

Note 1: In the String to Number node you can choose between different numerical
types, e.g Double, Integer, and Long.

32
Tip on Data Manipulation with KNIME Analytics Platform

Resort columns
Remove columns

The Table Manipulator node allows you to Change column name


perform a lot of the described transformations in
Change data type
one node. You can use it to:

• Concatenate multiple tables (after adding


dynamic input ports by clicking on … on
the lower left of the node)

• Filter, resort, and rename columns

• Change the data type of column

33
Data Aggregation

34
Pivot Tables Pivoting Node

Excel KNIME Analytics Platform


To create a pivot table in Excel The Pivoting node is configured via three tabs:
• Click on any single cell inside the data set • “Groups” defines the group columns, aka columns in
• Go to the Insert tab and click Pivot Table the “Rows”-frame in Excel (final row IDs)
• “Pivots” defines the pivoting columns, aka columns in
the “Columns”-frame in Excel (final column headers)
• “Manual Aggregation” defines the aggregation variables and the
aggregation methods
• Select table / range and output The “Value” setting option corresponds to the “Manual Aggregation” tab.
location The Pivoting node produces three output tables: the pivot table and the total
values for columns and rows.

Column Values
Row s
s

• Choose pivot table fields by dragging them


into “Columns”, “Rows” and “Values”.

Note 1: The Pivoting node doesn’t have “Filter” options, but you can simply use a
Row Filter node beforehand.
Note 2: In KNIME you have to choose at least one column for the Groups and
Pivots. In case you want to choose only “Rows” you can use the GroupBy node.

35
Pivot Table without Columns GroupBy Node

Excel KNIME Analytics Platform


Create a pivot table as described on the previous page and drag only columns into The GroupBy node is configured via two tabs:
the “Rows” and “Values” frames. • “Groups” defines the group columns, aka columns in
the “Rows” frame in Excel (final row IDs)
• “Manual Aggregation” defines the aggregation
variables and the aggregation methods
The “Value” setting option corresponds to the “Manual Aggregation” tab.

Values
Rows

36
Unpivot Unpivoting Node

Excel KNIME Analytics Platform


To create an unpivot table in Excel The Unpivoting node is configured via one tab:
• Store the dataset in a table. • In the upper section, “Value columns” defines the
• Select any cell in the table. column(s) to unpivot, aka column selection in Excel.
• Click the “Data” tab and select “From Table/Range”. • In the lower section, “Retained columns” defines the
column(s) that remains unchanged, aka unselected
columns in Excel.

• This opens the “Power Query Editor”. Select the columns to unpivot by
holding down the shift key.
• Click the “Transform” tab of the Power Query Editor and select “Unpivot
Columns”.

• Click the “Home” tab of the “Power Query Editor”, and select “Close &
Load” to save the data unpivoted back in the Excel workbook.

Note 1: The unpivot command is available without any additional downloads in


Excel 2016 for Windows. If you are using a different version, you may need to first
download the free Power Query add-in from the Microsoft site. Authoring in the Note 1: the Unpivoting node requires selecting at least one column in the “Value
Power Query Editor is not supported for Mac yet. columns” section.

37
Math Functions and
Text Functions

38
Math Functions Math Formula Node

Excel KNIME Analytics Platform


Here is a list of frequently used math formulas in Excel. You will find a translation The Math Formula node evaluates a mathematical expression
into the KNIME formulas on the next page: based on the values in a row. It has a long list of functions. The
• SUM table on the next page provides a translation of the most used
• AVERAGE Excel functions.
• MEDIAN Columns
• SUMPRODUCT
• ABS
• SUMIF
• Round functions
o ROUND
o ROUNDUP
Functions
o ROUNDDOWN

If you’re often used math formula is missing sent a message to …


Expression
n

Note 1: You can decide whether you want to append a new column or replace one
of the columns, by using the checkboxes underneath the Expression frame.
Note 2: By activating the checkbox “Convert to Int” you can ensure that your
output appended / replaced column is of type Integer.
Note 3: To perform the same mathematical expression on multiple columns you
can use the Math Formula (Multi Column) node.

39
Math Functions Math Formula Node

Excel KNIME Analytics Platform


SUM(number1;number2;…) COL_SUM(Col) = Sum of the values in the selected column.
Col1 + Col2. = Sum of the values in Col1 and the values in Col2 for
each row.

AVERAGE(number1;number2;…) COL_MEAN(col_name) = Average of the values in the selected column.


average(Col1, Col2,…) = Average of the values of the selected columns for
each row.

MEDIAN(number1;number2;…) COL_MEDIAN(Col_name) = Median of the values in the selected column.


Median(Col1, Col2,…) = Median of the values of the selected columns for
each row.

SUMPRODUCT(number1;number2;…) Sequence of two “Math Formula” nodes:


First one: Multiply the two columns using the expression Col1*Col2 and
Append a new column.
Second one: Use COL_SUM on the new column.

ABS(number1) abs(Col) = The absolute value for all values in the selected column

SUMIF(range; criteria) Sequence of two “Math Formula” nodes:


First one: Append new column with if(criteria, column to sum ,0)
Second one: Use COL_SUM on the new column

Round functions: round(Col, NumberOf Digits) = Number of digits is optional.


ROUND(number1, num_digits) ceil(Col*10^(num_digits))/10^num_digits
ROUNDUP(number1, num_digits) floor(Col*10^(num_digits))/10^num_digits
ROUNDDOWN(number1, num_digits)

40
Concatenation and Find& Replace String Manipulation Node

Excel KNIME Analytics Platform


1. The function CONCATENCATE( text1, text2,..) combines different texts into one The String Manipulation node manipulates columns of type
cell. String based on the defined expression. A lot of different
functions are available.
2. To find and replace a certain value you have to go to “Edit -> Find -> Replace…”
to open the dialog below.

In the dialog that opens, you can define the value you want to replace and the
value you want to replace it with.

1. CONCATENATE (text1, text2,..) from Excel

join(col1, col2 ,…) or join(col1, “your own string”,…)


Note 1: You can either join the values of different columns row wise, or add also
another String value to a column.

2. Find&Replace

Replace(string, search, replace)


Note 1: String is the String value where you want replace something. This can be a
column to apply the replace row wise or a fixed String value.
Note 2: To perform the same manipulation on multiple string columns you can
use the String Manipulation (Multi Column) node

41
Formatting Excel Tables

42
In chapter 1 we introduced the Excel Writer, which you
can use to write your result table into an Excel Sheet.
By default, this is a simple table without any
formatting like colors, border cells, etc. In this chapter,
we want to show you how to use the XLS Formatting
nodes of the community extension Continental Nodes
for KNIME. These nodes enable you to add formatting
instructions and advanced settings to already existing
XLS files, so that you can create Excel reports that
have the look and feel you used to.

The key to your formatted Excel sheet is an additional


XLS Control Table of the same size as the original file
with one or more comma separated tag values, e.g.
header, border, etc. Different XLS Formatter nodes
assign different formatting instructions to the cells
based on these tags, e.g. you can change the
Figure 1 On the left you can see an Excel table created by an Excel Writer node without formatting,
background color for all cells that are tagged “header”. and on the right a styled table after formatting information
Your formatting can then be applied to an already have been added with the XLS Formatter nodes, e.g. yellow background for the headers.
existing Excel sheet with the XLS Formatter (apply)
node.

Hint: Use a flow variable connection to make sure that


the Excel files are already written.

43
Figure 2 Bottom left you can see a control table with tag values, which is the key for your styled table. Based on
the tag values the yellow XLS formatting nodes collect formatting instructions, which are them applied by the
XLS Formatter (apply) node, producing the styled table (top right).

To summarize, this means we have to perform the following steps:

1. Write the table to an Excel Sheet


2. Create an XLS Control Table with tag values
3. Add formatting instructions based on tag values
4. Apply the formatting instructions to the existing Excel Sheet

44
This chapter is divided into two sections. The first section of this chapter
shows two ways of creating an XLS Control Table with tags. The second
section introduces some of the nodes that are available to add formatting
instructions.

Hint: You can’t find the nodes in your node repository? The Continental
Nodes for KNIME are a community extension that you can install by
dragging the extension from the KNIME Hub to KNIME Analytics Platform
or by installing the extension as described in this video.

Creating an XLS Control Table with Tag Values

As the saying goes, many roads lead to Rome. This section introduces
two different roads or approaches for creating an XLS Control Table.
(The second approach happens to be my personal favorite!) The “key
node” in both examples is the XLS Control Table Generator node.

Figure 3 The configuration dialog for the XLS Control Table Generator node

45
Approach 1: Table Creator + XLS Control Table Generator

The first approach to create an XLS Control Table with tabs involves a
combination of a Table Creator and an XLS Control Table Generator node.
This is an easy approach, however the downside is that it entails a lot
manual work creating the tag table and you have the problem that the tag
table is static. Therefore, this approach is only recommended for small
tables, where the number of rows and columns won’t change.

Open the configuration window of the Table Creator node to add one or
multiple tag values for each cell. If you want to enter multiple tags,
remember to separate them with a comma. The XLS Control Table Here you can see one option to create a control table using the
Generator node transforms the table into an XLS Control table and Table Creator node and the XLS Control Table Generator node
replaces the column names with letters and the row IDs with numbers.
The checkbox “write column header to first row” gives you the option of
retaining the column headers, similar to the option “add column headers”
in the Excel Writer node.

Approach 2: XLS Control Table Generator + Rule Engine

The second approach we want to show uses a combination of two XLS


Control Table Generator nodes with the unpivot option activated and a Rule
Engine node.

Activate the checkbox “unpivot result table” in the configuration window of


the first XLS Control Table Generator node to output a table that has one row
for each cell in the input table, including the value, the row number, the Figure 4 Here you can see another option to create a control
table using a combination of , where the Rule Engine node is
column header, and more.
used to define tag values based on the row and column number.

46
Figure 5 Here you can see the resulting table when activating the checkbox “unpivot result
table” in the XLS Control Table Generator node. The node creates one row for each cell
including value, row number, column header, etc.

This table is a great basis to now transform values into tags with the Rule
Engine node. For example, we can replace all values in the first row with the
tag “header”, or replace all values in the first column that have a row number
higher than 3 three with the tag value “cw”.

Hint: Activate the checkbox “Replace Column” and select the column “Value”.

47
Figure 6 On the left is the configuration dialog of the Rule Engine node. In the Expression section you can see
defined rules to replace the original values with tags, based on the row and column number. On the left you can
see the output table where the rules are applied, and the values are replaced with the different tags

A second XLS Control Table Generator node can transform this table back into its original
form, where the values are replaced with the different tag. This feature is automatically
activated when the node detects an input table that was created by an XLS Control Table
Generator node in unpivot mode.

This approach involves much less manual work compared to the first approach and can be
implemented in a way to handle changing table dimensions gracefully.

Tip: If you want to write multiple tables with changing tables spec below each other you can
create for each table a XLS Control Table and concatenate them afterwards.

Hint: Another helpful node to create a static XLS Control Tables is the XLS Control Table
from Cell Range node.
48
Adding Formatting Actions based on Tag Values

The next step to adding background colors or a border, etc. to your table is a sequence of
XLS Formatting nodes, similar to the workflow in figure 2.

As you can see all nodes in the example workflow have two input ports and one output port:

• a data input port


• an optional XLS Formatter input port (square with green border)
• an output port, which is an XLS Formatter port

The green square is a special port type of the extension, which collects the different
formatting instructions. The data input port expects the table with the tag values.
The optional input port can be used to feed an XLS Formatting table with previous
formatting instructions to which the instructions of the node should be added.

The figure on the right shows you an overview of all the nodes in the Continental extension. I
will introduce my favorite ones and leave it up to you to explore the others.

49
The XLS Background Colorizer node changes the The XLS Conditional Formatter node changes the
background color of cells. You can assign either a background for the cells with a certain tag value
static color and / or pattern fill. One option is to assign according to their numerical value. In the configuration
the same color to all cells with a specific tag value, e.g. window you can define a color scale by setting a
all cells with tag “header” should have a yellow minimum and maximum value and assigning a color to
background. Another option is to use RGB values in each. Optionally you could set a mid point value and
either hex syntax #FFD800 or decimal syntax R/G/B as assign a color to that. Cells with values higher or lower
tags and use them as the background color. than the thresholds will have the background color of
the minimum /maximum value.

The XLS Sheet Selector and the XLS Merger node are The XLS Border Formatter node can add borders to a
really helpful nodes if your EXCEL file has more than given range specified by a certain tag or by all tags. By
one sheet. By default the formatting is always applied activating the corresponding checkboxes, you can add
to the first sheet. So, if you have an Excel file with only borders to the top, right, bottom, and left. In addition to
one sheet you don’t have to worry about these two adding a border around the range specified by the tags,
nodes. However, if you have multiple sheets the XLS the node gives you the option to use inner vertical and
Sheet Selector allows you to define which sheet your horizontal boarder lines in each cell, too.
XLS Control table is for.

50
The XLS Formatter (apply) node reads an unformatted The XLS Cell Merger node merges the cells for given
Excel file, applies all the collected formatting rectangular ranges of input tags into one cell. For
instructions, and saves the nice Excel file in the defined example, we can merge all cells in the first row and
output location. centralize the title with the XLS Font Formatter node.
This node works only on strictly rectangular ranges.
The value of the merged cell is the value of the most
top left cell of the merged range.

The XLS Format Merger node allows you to either combine formatting instructions for different sheets prior to
using the XLS Formatter (apply) node or when applied to the same sheet, it merges the properties at the lowest
detail level (e.g. the formatting instructions for the cell A1 is bold in control table one and italic in control table two.
The subsequent formatting instruction for A1 is italic and bold). Thereby, the upper input port overwrites a lower
one in case of conflicting information (e.g. two different font colors for the same cell).

This was a short introduction. You can find further information about the
different XLS Formatter nodes in the Continental extension in the
documentation https://www.knime.com/community/continental-nodes-for-
knime-xls-formatter or, from within KNIME Analytics Platform, by looking in the
node description of each individual node.

51
The KNIME Booklet for Excel Users
Are you an experienced Excel user and want to start using KNIME Analytics Platform?

It’s sometimes difficult to switch from one software tool to another. But this booklet is the
perfect starting point as it maps the most commonly used Excel functions and techniques
to their KNIME equivalents. Find out, for example, how data reading, filtering, sorting, and
vlookup work in KNIME.

For a complete introduction to KNIME, please refer to my book “KNIME Beginner’s Luck”
available from KNIME Press under https://www.knime.com/knimepress

About the Author

Kathrin Melcher is currently a Data Scientist at KNIME. She holds an MSc in Mathematics,
from the University of Konstanz, Germany. She joined the KNIME Evangelism team in May
2017 and has a strong interest in data science, machine learning, and algorithms. She enjoys
teaching and sharing her knowledge on these topics.

52

You might also like