The very first thing we need is a file. For this demo, our sample file is a comma-separated file with 3 columns and 3 rows.

Now that we have the file, the first step is to upload it to Databricks:

1. Get the cluster running.
2. Go to the Data tab.
3. Click on "Create Table".
4. Drag the file to the specified space.

At this point, the file is uploaded to DBFS storage and we can work with it. Copy the location of the file (here: /FileStore/tables/SampleData-1.csv) and head back to the notebook.

Creating a dataframe from the file:

```python
dataframe_var = spark.read.option("header", "true").csv('/FileStore/tables/SampleData-1.csv')
```

Display the dataframe with the notebook's built-in display() function:

```python
display(dataframe_var)
```

We can see that the file data is successfully displayed in the dataframe. We will take a look at how to load files into tables in the next blog.
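
As an aside, it can help to confirm from the notebook itself that the upload actually landed in DBFS. Here is a minimal sketch using dbutils.fs.ls, which is available by default in Databricks notebooks; the /FileStore/tables/ path matches the location shown above.

```python
# Minimal sketch: list the files under /FileStore/tables/ to confirm
# the upload worked (dbutils is provided by the Databricks runtime).
for f in dbutils.fs.ls("/FileStore/tables/"):
    print(f.name, f.size)
```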
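
One optional tweak, not covered above: with the read as written, every column comes back as a string. Passing inferSchema (a standard Spark CSV reader option) asks Spark to detect the column types instead; a sketch of the same read with inference turned on:

```python
# Sketch: same read as above, but with type inference turned on.
dataframe_typed = (
    spark.read
         .option("header", "true")
         .option("inferSchema", "true")   # detect ints, doubles, etc.
         .csv("/FileStore/tables/SampleData-1.csv")
)
dataframe_typed.printSchema()  # verify the inferred column types
```

Note that inferSchema triggers an extra pass over the data, which is negligible for a 3-row file but worth keeping in mind for larger ones.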