I would like to set environment variables for AWS key ID and secret. This is essentially the same question as was asked here. However, that thread did not seem to include a final answer.
Following the answers in that thread I ran the following command:
dbutils.fs.put("dbfs:/databricks/init/init.bash", """
#!/bin/bash
export AWS_ACCESS_KEY_ID=ACCESS_KEY
export AWS_SECRET_KEY=SECRET_KEY
sudo echo AWS_ACCESS_KEY_ID=ACCESS_KEY >> /etc/environment
sudo echo AWS_SECRET_KEY=SECRET_KEY >> /etc/environment
""", true)
This seemed to run the script during cluster initialization, as I was able to see the lines for the AWS keys in /etc/environment. However, the AWS environment variables did not show up when I checked the environment from a notebook shell cell, nor when I checked from Python.
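For reference, the Python side of that check (a minimal sketch, assuming a Python notebook) came back empty for both keys:

import os

# Neither key is visible in the notebook environment,
# even though the init script wrote them to /etc/environment
print(os.environ.get("AWS_ACCESS_KEY_ID"))  # None
print(os.environ.get("AWS_SECRET_KEY"))     # None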
So although the script seems to run at init time, it does not seem to have the intended effect of setting the environment variables. It seems some of the posters in the previous thread also had this problem, but I never saw an answer to that.
Must I use some command other than 'export' in my init script in order to register the environment variables?
Also, the suggestion in the previous thread to echo the environment variable assignments to /etc/environment was new to me. Are processes supposed to pick up environment variable settings from that file?
Answer by Tim Ryan · Mar 20 at 06:30 PM
After some experimentation I was able to achieve the desired results by appending my environment variable declarations to the file /databricks/spark/conf/spark-env.sh. I changed the init file as follows:

dbutils.fs.put("dbfs:/databricks/init/init.bash", """
#!/bin/bash
sudo echo export AWS_ACCESS_KEY_ID=ACCESS_KEY >> /databricks/spark/conf/spark-env.sh
sudo echo export AWS_SECRET_ACCESS_KEY=SECRET_KEY >> /databricks/spark/conf/spark-env.sh
""", true)
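With the keys appended to spark-env.sh, the variables showed up once the cluster was restarted. A quick sanity check (again a sketch, assuming a Python notebook):

import os

# spark-env.sh is sourced when the cluster starts, so after a restart
# the exported keys are visible to the driver process
print(os.environ.get("AWS_ACCESS_KEY_ID"))      # ACCESS_KEY
print(os.environ.get("AWS_SECRET_ACCESS_KEY"))  # SECRET_KEY

(As in the original script, ACCESS_KEY and SECRET_KEY above are placeholders for the real credential values.)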