I keep getting the following error in Bash (on Windows):
$ URL = "https://nyc-tlc.s3.amazonaws.com/trip+data/yellow_tripdata_2022-01.parquet"
bash: URL: command not found
The full command I am trying to run in Bash is:
URL = "https://nyc-tlc.s3.amazonaws.com/trip+data/yellow_tripdata_2022-01.parquet"
python ingest_data.py \
  --user=root \
  --password=root \
  --host=localhost \
  --port=5432 \
  --db=ny_taxi \
  --table_name=yellow_taxi_trips \
  --url=${URL}
Running the full command runs the ingest_data.py file, but no download happens (I assume because of the URL: command not found error).
If I run ingest_data.py on its own, nothing happens either, just like when running the full command above.
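A minimal sketch of why the script can still start while nothing gets downloaded, assuming URL was never set because the assignment line failed: an unset variable expands to an empty string, so the script receives an empty --url value. The `arg` variable and the guard message are illustrative only.

```shell
# Assumption: URL was never set, because `URL = "..."` failed as a command.
unset URL

# An unset variable expands to the empty string, so the script would still
# be started, just with an empty --url value.
arg="--url=${URL}"
echo "$arg"   # prints: --url=

# Optional guard: ${URL:?message} fails when URL is unset or empty.
# It runs in a subshell here so the surrounding shell keeps going.
( : "${URL:?URL is not set}" ) 2>/dev/null || echo "URL is empty or unset"
```

Putting a guard like `${URL:?}` in front of the real command would turn a silently empty argument into a visible error.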
The 'important' part of the ingest_data.py file for using the URL is os.system(f"wget {url} -O {flatfile_parquet}"):
import os
from sqlalchemy import create_engine

def main(params):
    user = params.user
    password = params.password
    host = params.host
    port = params.port
    db = params.db
    table_name = params.table_name
    url = params.url
    flatfile_parquet = 'output.parquet'

    # download the parquet file (wget's output-file flag is a capital -O, not a zero)
    os.system(f"wget {url} -O {flatfile_parquet}")

    # connect to server
    engine = create_engine(f'postgresql://{user}:{password}@{host}:{port}/{db}')
I keep finding threads that say to use curl, but I don't want to download the file in Bash. I want to store the URL in a variable and then use it in the ingest_data.py file.
Any advice is appreciated, I'm kind of a noob with Bash.
Don't put spaces around = when assigning a value to a variable. – doneal24 Jun 22 '22 at 16:29
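Bash assignments take no spaces around =. With spaces, bash reads URL as a command name and = as its first argument, which is where "URL: command not found" comes from. A minimal sketch of the corrected form, reusing the command from the question. The leading `echo` is an assumption added here so the snippet only prints the command (a dry run) instead of needing ingest_data.py and a running database:

```shell
# No spaces around '=': this is a variable assignment, not a command.
URL="https://nyc-tlc.s3.amazonaws.com/trip+data/yellow_tripdata_2022-01.parquet"

# Dry run: print the command that would be executed.
# Remove the leading `echo` to actually run the script.
echo python ingest_data.py \
  --user=root \
  --password=root \
  --host=localhost \
  --port=5432 \
  --db=ny_taxi \
  --table_name=yellow_taxi_trips \
  --url="${URL}"
```

Quoting "${URL}" keeps the value as a single argument even if a URL ever contains characters the shell would otherwise split or expand.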