Session 17 Snowflake Snowpipe
-------
- WHAT IS CONTINUOUS LOADING
- SNOWPIPE
- STEPS IN CREATING PIPE
- SNOWPIPE SYNTAX
- SNOWPIPE DDL
- TROUBLESHOOTING SNOWPIPE
- MANAGING SNOWPIPE
SNOWPIPE:
----------
- A pipe is a named database object that contains the COPY command used to load
  the data.
- Snowpipe loads data within minutes after files are added to a stage and
  submitted for ingestion.
- Snowpipe uses compute resources provided by Snowflake; it is a serverless
  service.
- One-time setup.
- Suggested file size is 100 - 250 MB (compressed).
- Snowpipe uses file-loading metadata associated with each pipe object to
  prevent reloading the same files.
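The file-loading metadata mentioned above can be inspected with the COPY_HISTORY
table function. A minimal sketch, assuming a target table MYDB.PUBLIC.emp_data
(the table used in the example later in these notes):

// Show files loaded into the table during the last hour
SELECT file_name, last_load_time, row_count, status
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
       TABLE_NAME => 'MYDB.PUBLIC.EMP_DATA',
       START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP())));

Files that already appear here with a LOADED status will not be reloaded by the
pipe even if they are staged again with the same name and contents.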
Snowpipe Syntax:
----------------
CREATE OR REPLACE PIPE <pipe_name>
AUTO_INGEST = [TRUE|FALSE]
AS
<copy_statement>;

// Check the status of a pipe
SELECT SYSTEM$PIPE_STATUS('<pipe_name>');
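SYSTEM$PIPE_STATUS returns a JSON string, and the two timestamps discussed
below are fields inside that JSON. An illustrative example (the pipe name and
the output values are assumptions, not real output):

SELECT SYSTEM$PIPE_STATUS('MYDB.PIPES.EMPLOYEE_PIPE');
// Illustrative result:
// {"executionState":"RUNNING","pendingFileCount":0,
//  "lastReceivedMessageTimestamp":"...","lastForwardedMessageTimestamp":"..."}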
lastReceivedMessageTimestamp:
-----------------------------
- Specifies the timestamp of the last event message received from the message
  queue.
- If the timestamp is earlier than expected, this indicates an issue with the
  service configuration (e.g. Amazon SQS).
- Verify whether any settings were changed in your service configuration.
lastForwardedMessageTimestamp:
------------------------------
- Specifies the timestamp of the last "create object" event message that was
  forwarded to the pipe.
- If this value differs noticeably from the timestamp above, there is a mismatch
  between the cloud storage path where the new data files are created and the
  path specified in the Snowflake stage object.
- Verify the paths and correct them.
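One way to verify the paths is to compare the stage definition with the files
Snowflake can actually see through it. A minimal sketch, using the stage from
the example later in these notes:

// Show the URL and other properties configured on the stage
DESC STAGE mydb.externals_stages.stage_aws_pipes;

// List the files visible at that location; new data files should appear here
LIST @mydb.externals_stages.stage_aws_pipes;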
MANAGING PIPES:
---------------
- Use the DESC PIPE <pipe_name> command to see the pipe properties and the copy
  command.
- Use the SHOW PIPES command to see all the pipes.
- We can pause/resume pipes with PIPE_EXECUTION_PAUSED = TRUE/FALSE.
- It is best practice to pause and resume pipes before and after performing the
  below actions:
  - when modifying the stage object.
  - when modifying the file format object, if the stage uses one.
  - when modifying the COPY command.
- To modify the COPY command, recreating the pipe is the only possible way.
- When you recreate a pipe, all the load history will be dropped.
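The pause-and-recreate flow above can be sketched as follows; the pipe, table,
and stage names are taken from the example later in these notes:

// Pause the pipe before changing it
ALTER PIPE MYDB.PIPES.EMPLOYEE_PIPE SET PIPE_EXECUTION_PAUSED = TRUE;

// Recreate the pipe with the modified COPY statement
CREATE OR REPLACE PIPE MYDB.PIPES.EMPLOYEE_PIPE
AUTO_INGEST = TRUE
AS
COPY INTO MYDB.PUBLIC.emp_data
FROM @mydb.externals_stages.stage_aws_pipes
pattern = '.*employee.*';

// Resume the pipe
ALTER PIPE MYDB.PIPES.EMPLOYEE_PIPE SET PIPE_EXECUTION_PAUSED = FALSE;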
-----------------------------------------------------------------------------------
- Log in to the AWS account, open the S3 bucket, and create a folder for pipes:
  Amazon S3 > Buckets > awss3bucketjana > pipes/csv
(or)
LIST @mydb.externals_stages.stage_aws_pipes;
// Create a pipe
CREATE OR REPLACE PIPE MYDB.PIPES.EMPLOYEE_PIPE
AUTO_INGEST = TRUE
AS
COPY INTO MYDB.PUBLIC.emp_data
FROM @mydb.externals_stages.stage_aws_pipes
pattern = '.*employee.*';

DESC PIPE MYDB.PIPES.EMPLOYEE_PIPE;
// Get the notification_channel ARN from DESC PIPE and use it in the S3 event
// notification's SQS queue.
Go to the Amazon S3 bucket:
---------------------------
- Event name: snowfile_employee
- Prefix (optional): pipes/csv
- Suffix (optional): ignore it
- Event types: Object creation (enable "All object create events")
- Destination: select the SQS queue radio button
- Specify SQS queue: select "Enter SQS queue ARN", paste the notification
  channel ARN copied from the pipe, and save the changes.
// Upload the file to pipes/csv and verify the data in the table after a minute
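A quick way to verify the load, using the table and pipe from the example
above:

// The row count should grow once Snowpipe ingests the new file
SELECT COUNT(*) FROM MYDB.PUBLIC.emp_data;

// Check the pipe's execution state and pending file count
SELECT SYSTEM$PIPE_STATUS('MYDB.PIPES.EMPLOYEE_PIPE');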
************************************ END ************************************