[GCP] 4-2. Stream Processing

I will expand on real-time streaming in a later post, when I write up a personal GCP project.

 < Hands-on - Data Streaming >

 

- 1 -

 cd realtime
 ./run_oncloud.sh <BUCKET-NAME>  # the script has been renamed from run_on_cloud.sh, so run it as run_oncloud.sh

Contents of the script:

#!/bin/bash

if [ "$#" -ne 1 ]; then
    echo "Usage: ./run_on_cloud.sh  bucket-name"
    exit
fi

PROJECT=$(gcloud config get-value project)
BUCKET=$1

cd chapter4

bq rm flights.streaming_delays   # delete existing table

mvn compile exec:java \
      -Dexec.mainClass=com.google.cloud.training.flights.AverageDelayPipeline \
      -Dexec.args="--project=$PROJECT \
      --stagingLocation=gs://$BUCKET/staging/ \
      --gcpTempLocation=gs://$BUCKET/staging/tmp \
      --averagingInterval=60 \
      --speedupFactor=30 \
      --runner=DataflowRunner"

cd ..
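
As a rough sanity check on the flags above, here is a small Python sketch of how the two numbers might combine. It assumes that `--averagingInterval` is given in minutes of event time and that the pipeline divides it by `--speedupFactor` to get the wall-clock window. Neither assumption is stated in the script itself, and `effective_window_seconds` is a hypothetical helper, not part of the pipeline.

```python
def effective_window_seconds(averaging_interval_minutes, speedup_factor):
    """Wall-clock length of the averaging window, in seconds.

    Assumption: the interval is in minutes of event time and is
    compressed by the speed-up factor when the data is replayed.
    """
    if speedup_factor <= 0:
        raise ValueError("speedup_factor must be positive")
    return averaging_interval_minutes * 60 / speedup_factor


# Values taken from the mvn command: --averagingInterval=60, --speedupFactor=30
window = effective_window_seconds(60, 30)
print(window)
```

Under those assumptions, a 60-minute average with a 30x speed-up would be computed over about 2 minutes of wall-clock time.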

- 2 -

cd simulate
 python3 ./simulate.py --startTime '2015-05-01 00:00:00 UTC' --endTime '2015-05-04 00:00:00 UTC' --speedFactor=30 --project $DEVSHELL_PROJECT_ID
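
To get a feel for how long the simulation keeps running, the sketch below works out the wall-clock duration of the replay. It assumes `--speedFactor=30` means event time is replayed 30 times faster than real time. `replay_seconds` is a hypothetical helper for illustration, not something taken from simulate.py.

```python
from datetime import datetime, timezone


def replay_seconds(start, end, speed_factor):
    """Wall-clock seconds needed to replay the events between start and end.

    Assumption: event time is compressed uniformly by speed_factor.
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return (end - start).total_seconds() / speed_factor


# Same range as the simulate.py command: 2015-05-01 to 2015-05-04 UTC
start = datetime(2015, 5, 1, tzinfo=timezone.utc)
end = datetime(2015, 5, 4, tzinfo=timezone.utc)

seconds = replay_seconds(start, end, 30)
print(seconds / 3600, "hours")
```

If the assumption holds, the three days of flight events take roughly 2.4 hours to replay, so the Dataflow job from step 1 has to stay up at least that long.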

- 3 -

- 4 -


"Data Science on the Google Cloud Platform by Valliappa Lakshmanan (O'Reilly). Copyright 2018 Google Inc."

 
