Friday, July 7, 2017

Apache NiFi


Apache NiFi is a data flow tool. The commonly used processors are listed below.

If a processor shows a warning, hover the cursor over the exclamation mark to see the error message.

1. GetFile - Reads files from the local file system and puts them into the flow

Specify Input Directory

2. PutFile - Writes data to the local file system

Specify Directory

3. SplitText - Splits the data by the specified number of lines or the specified data size

Specify Line Split Count / Maximum Fragment Size
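
To make the Line Split Count idea concrete, here is a minimal Python sketch of splitting by a fixed number of lines. It is only an illustration of the concept, not NiFi's implementation: the function name split_text and the sample input are assumptions, and it ignores Maximum Fragment Size and the other SplitText options.

```python
def split_text(content, line_split_count):
    """Split text into fragments of at most line_split_count lines each.

    Hypothetical helper that mimics the idea behind SplitText's
    Line Split Count; line endings are kept in each fragment.
    """
    lines = content.splitlines(keepends=True)
    return [
        "".join(lines[i:i + line_split_count])
        for i in range(0, len(lines), line_split_count)
    ]


# Sample input (made up for illustration): five lines, split two at a time.
fragments = split_text("a\nb\nc\nd\ne\n", 2)
for fragment in fragments:
    print(repr(fragment))
```

Each fragment would correspond to one outgoing flow file; the last fragment holds whatever lines are left over.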

4. ExtractText - Extracts values from the text using a RegEx

Add a property with any name and give a RegEx as its value. Each capture group becomes an attribute named variable.1, variable.2, etc.

E.g.: fieldval - (.+),(.+),(.+),(.+)

The extracted values are then fieldval.1, fieldval.2, fieldval.3 and fieldval.4
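
The capture-group naming can be tried outside NiFi with a small Python sketch. This is an approximation under stated assumptions: the helper extract_text and the sample record are made up for illustration, and Python's re module stands in for the Java regex engine that NiFi uses, so edge cases may differ.

```python
import re


def extract_text(content, name, pattern):
    """Return attributes name.1, name.2, ... for the first regex match.

    Hypothetical helper that imitates how ExtractText names the
    capture groups; returns an empty dict when nothing matches.
    """
    match = re.search(pattern, content)
    if match is None:
        return {}
    return {
        f"{name}.{index}": value
        for index, value in enumerate(match.groups(), start=1)
    }


# Sample record (made up): four comma-separated fields.
attributes = extract_text(
    "101,Leela,Hyderabad,2017", "fieldval", r"(.+),(.+),(.+),(.+)"
)
for key, value in attributes.items():
    print(key, "=", value)
```

Because each (.+) is greedy, this pattern only splits cleanly when the line has exactly three commas; for records with a varying number of fields, a pattern such as ([^,]+) per field may be safer.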

5. PutHDFS - Writes data to HDFS

Specify Hadoop Configuration Resources - the paths of hdfs-site.xml and core-site.xml, separated by a comma
Specify Directory - the output HDFS directory location

Note: Make sure that full permissions are applied to the HDFS output path
E.g.: hadoop fs -chmod -R 777 /user/hadoop/Leela/Nifi1_out

References:

https://www.youtube.com/watch?v=4yBc7hHvaQU
