How to Upload a CSV File to CockroachDB
The `IMPORT INTO` statement imports CSV, Avro, or delimited data into an existing table, by appending new rows to the table.
Considerations
- `IMPORT INTO` works for existing tables. To import data into new tables, read the Import into a new table from a CSV file example below.
- `IMPORT INTO` takes the table offline before importing the data. The table will be online again once the job has completed successfully.
- `IMPORT INTO` cannot be used during a rolling upgrade.
- `IMPORT INTO` is a blocking statement. To run an `IMPORT INTO` job asynchronously, use the `DETACHED` option.
- `IMPORT INTO` invalidates all foreign keys on the target table. To validate the foreign key(s), use the `VALIDATE CONSTRAINT` statement, as sketched below.
- `IMPORT INTO` is an insert-only statement; it cannot be used to update existing rows (see `UPDATE`). Imported rows cannot conflict with primary keys in the existing table, or any other `UNIQUE` constraint on the table.
- `IMPORT INTO` does not offer `SELECT` or `WHERE` clauses to specify subsets of rows. To do this, use `INSERT`.
- `IMPORT INTO` will cause any changefeeds running on the targeted table to fail.
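For example, after an import completes, each invalidated foreign key can be re-validated with `VALIDATE CONSTRAINT`; a minimal sketch, assuming a hypothetical table `orders` with a constraint named `fk_customers`:

```sql
> ALTER TABLE orders VALIDATE CONSTRAINT fk_customers;  -- table and constraint names are placeholders
```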
New in v21.2: `IMPORT INTO` now supports importing into `REGIONAL BY ROW` tables.
Required privileges
Table privileges
The user must have the `INSERT` and `DROP` privileges on the specified table. (`DROP` is required because the table is taken offline during the `IMPORT INTO`.)
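As a quick sketch of granting these privileges (the table and user names are placeholders):

```sql
> GRANT INSERT, DROP ON TABLE customers TO max;  -- hypothetical user "max"
```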
Source privileges
The source file URL does not require the `ADMIN` role in the following scenarios:

- S3 and GS using `SPECIFIED` (and not `IMPLICIT`) credentials. Azure is always `SPECIFIED` by default.
- Userfile

The source file URL does require the `ADMIN` role in the following scenarios:

- S3 or GS using `IMPLICIT` credentials
- Use of a custom endpoint on S3
- Nodelocal, HTTP, or HTTPS

Learn more about cloud storage for bulk operations.
Synopsis
Note:
While importing into an existing table, the table is taken offline.
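The following is a minimal sketch of the statement's general form, assembled from the parameters described below; `<format>` is one of `CSV`, `AVRO`, or `DELIMITED`, and brackets mark optional parts:

```sql
IMPORT INTO table_name (column_name [, ...])
    <format> DATA (file_location [, ...])
    [WITH <option> [= <value>] [, ...]];
```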
Parameters
| Parameter | Description |
|---|---|
| `table_name` | The name of the table you want to import into. |
| `column_name` | The table columns you want to import. Note: Currently, target columns are not enforced. |
| `file_location` | The URL of a CSV or Avro file containing the table data. This can be a comma-separated list of URLs. For an example, see Import into an existing table from multiple CSV files below. |
| `<option> [= <value>]` | Control your import's behavior with import options. |
Delimited data files
The `DELIMITED DATA` format can be used to import delimited data from any text file type, while ignoring characters that need to be escaped, like the following:

- The file's delimiter (`\t` by default)
- Double quotes (`"`)
- Newline (`\n`)
- Carriage return (`\r`)

For examples showing how to use the `DELIMITED DATA` format, see the Examples section below.
Import options
You can control the `IMPORT` process's behavior using any of the following key-value pairs as a `<option> [= <value>]`.
| Key | Context | Value |
|---|---|---|
| `delimiter` | CSV DATA | The unicode character that delimits columns in your rows. Default: `,`. For example, set `delimiter = e'\t'` to use tab-delimited values. |
| `comment` | CSV DATA | The unicode character that identifies rows to skip. |
| `nullif` | CSV DATA, DELIMITED DATA | The string that should be converted to `NULL`. For example, set `nullif = ''` to treat empty columns as `NULL`. |
| `skip` | CSV DATA, DELIMITED DATA | The number of rows to be skipped while importing a file. Default: `'0'`. For example, set `skip = '1'` to import CSV files with column headers. |
| `decompress` | General | The decompression codec to be used: `gzip`, `bzip`, `auto`, or `none`. Default: `'auto'`, which guesses based on file extension (`.gz`, `.bz`, `.bz2`). `none` disables decompression. |
| `rows_terminated_by` | DELIMITED DATA | The unicode character that indicates new lines in the input file. Default: `\n` |
| `fields_terminated_by` | DELIMITED DATA | The unicode character used to separate fields in each input line. Default: `\t` |
| `fields_enclosed_by` | DELIMITED DATA | The unicode character that encloses fields. Default: `"` |
| `fields_escaped_by` | DELIMITED DATA | The unicode character, when preceding one of the above `DELIMITED DATA` options, to be interpreted literally. |
| `strict_validation` | AVRO DATA | Rejects Avro records that do not have a one-to-one mapping between Avro fields and the target CockroachDB schema. By default, CockroachDB ignores unknown Avro fields and sets missing SQL fields to `NULL`. CockroachDB will also attempt to convert the Avro field to the CockroachDB data type; otherwise, it will report an error. |
| `records_terminated_by` | AVRO DATA | The unicode character that indicates new lines in the input binary or JSON file. Not needed for Avro OCF. Default: `\n`. For example, set `records_terminated_by = e'\t'` to use tab-terminated records. |
| `data_as_binary_records` | AVRO DATA | Use when importing a binary file containing Avro records. The schema is not included in the file, so you need to specify the schema with either the `schema` or `schema_uri` option. |
| `data_as_json_records` | AVRO DATA | Use when importing a JSON file containing Avro records. The schema is not included in the file, so you need to specify the schema with either the `schema` or `schema_uri` option. |
| `schema` | AVRO DATA | The schema of the Avro records included in the binary or JSON file. Not needed for Avro OCF. |
| `schema_uri` | AVRO DATA | The URI of the file containing the schema of the Avro records included in the binary or JSON file. Not needed for Avro OCF. |
| `DETACHED` | N/A | When an import runs in `DETACHED` mode, it will execute asynchronously, and the job ID will be returned immediately without waiting for the job to finish. Note that with `DETACHED` specified, further job information and the job completion status will not be returned. To check on the job status, use the `SHOW JOBS` statement. To run an import within a transaction, use the `DETACHED` option. |
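As a brief illustration of the `WITH` option syntax, the following sketch skips a header row, treats empty strings as `NULL`, and runs asynchronously; the bucket and file names are placeholders:

```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      's3://{BUCKET NAME}/{customers.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    )
    WITH skip = '1', nullif = '', DETACHED;
```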
For examples showing how to use these options, see the IMPORT - Examples section.
For instructions and working examples showing how to migrate data from other databases and formats, see the Migration Overview. For information on how to import data into new tables, see IMPORT.
Requirements
Prerequisites
Before using `IMPORT INTO`, you should have:

- An existing table to import into (use `CREATE TABLE`). `IMPORT INTO` supports computed columns and the `DEFAULT` expressions listed below.
- The CSV or Avro data you want to import, preferably hosted on cloud storage. This location must be equally accessible to all nodes using the same import file location. This is necessary because the `IMPORT INTO` statement is issued once by the client, but is executed concurrently across all nodes of the cluster. For more information, see the Import file location section below.
Supported DEFAULT expressions
IMPORT INTO supports computed columns and the following DEFAULT expressions:
- `DEFAULT` expressions with user-defined types.
- Constant `DEFAULT` expressions, which are expressions that return the same value in different statements. Examples include:
  - Literals (booleans, strings, integers, decimals, dates)
  - Functions where each argument is a constant expression and the functions themselves depend solely on their arguments (e.g., arithmetic operations, boolean logical operations, string operations).
- Current `TIMESTAMP` functions that record the transaction timestamp, which include:
  - `current_date()`
  - `current_timestamp()`
  - `localtimestamp()`
  - `now()`
  - `statement_timestamp()`
  - `timeofday()`
  - `transaction_timestamp()`
- `random()`
- `gen_random_uuid()`
- `unique_rowid()`
- `nextval()`
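For instance, a table like the following sketch (the table and column names are hypothetical) uses only `DEFAULT` expressions from this list, so an import can omit those columns and let CockroachDB fill them in:

```sql
-- id and created_at use supported DEFAULT expressions.
CREATE TABLE events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at TIMESTAMPTZ DEFAULT now(),
    payload STRING
);

-- Only payload is imported; the other columns take their defaults.
IMPORT INTO events (payload)
    CSV DATA ('s3://{BUCKET NAME}/{events.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}');
```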
Available storage
Each node in the cluster is assigned an equal part of the imported data, and so must have enough temp space to store it. In addition, data is persisted as a normal table, so there must also be enough space to hold the final, replicated data. The node's first-listed/default store directory must have enough available storage to hold its portion of the data.
On `cockroach start`, if you set `--max-disk-temp-storage`, it must also be greater than the portion of the data a node will store in temp space.
Import file location
CockroachDB uses the URL provided to construct a secure API call to the service you specify. The URL structure depends on the type of file storage you are using. For more information, see the following:

- Use Cloud Storage for Bulk Operations
- Use a Local File Server for Bulk Operations
Performance
All nodes are used during the import job, which means all nodes' CPU and RAM will be partially consumed by the `IMPORT` task in addition to serving normal traffic.
For more detail on optimizing import performance, see Import Performance Best Practices.
Viewing and controlling import jobs
After CockroachDB successfully initiates an import into an existing table, it registers the import as a job, which you can view with `SHOW JOBS`.
After the import has been initiated, you can control it with `PAUSE JOB`, `RESUME JOB`, and `CANCEL JOB`.
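A minimal sketch of this workflow; the job ID shown is a placeholder for the value reported by `SHOW JOBS`:

```sql
> SHOW JOBS;
> PAUSE JOB 27536791415282;   -- placeholder job ID
> RESUME JOB 27536791415282;
```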
If initiated correctly, the statement returns when the import is finished or if it encounters an error. In some cases, the import can continue after an error has been returned (the error message will tell you that the import has resumed in the background).
Warning:
Pausing and then resuming an `IMPORT INTO` job will cause it to restart from the beginning.
Examples
The following provide connection examples for cloud storage providers. For more information on connecting to different storage options, read Use Cloud Storage for Bulk Operations.
We recommend reading the Considerations section for important details when working with `IMPORT INTO`.
Import into a new table from a CSV file
To import into a new table, use `CREATE TABLE` followed by `IMPORT INTO`.
Note:
As of v21.2, `IMPORT TABLE` will be deprecated; therefore, we recommend using the following example to import data into a new table.
First, create the new table with the necessary columns and data types:
```sql
CREATE TABLE users (
    id UUID PRIMARY KEY,
    city STRING,
    name STRING,
    address STRING,
    credit_card STRING
);
```

Next, use `IMPORT INTO` to import the data into the new table:

```sql
IMPORT INTO users (id, city, name, address, credit_card)
    CSV DATA (
      's3://{BUCKET NAME}/{customers.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    );
```

Import into an existing table from a CSV file
```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      's3://{BUCKET NAME}/{customers.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    );
```

Note:
The column order in your `IMPORT` statement must match the column order in the CSV being imported, regardless of the order in the existing table's schema.
Import into an existing table from multiple CSV files
```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      's3://{BUCKET NAME}/{customers.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}',
      's3://{BUCKET NAME}/{customers2.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}',
      's3://{BUCKET NAME}/{customers3.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}',
      's3://{BUCKET NAME}/{customers4.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    );
```

Import into an existing table from an Avro file
Avro OCF data, JSON records, or binary records can be imported. The following are examples of importing Avro OCF data.
To specify the table schema in-line:
```sql
> IMPORT INTO customers
    AVRO DATA (
      's3://{BUCKET NAME}/{customers.avro}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    );
```

For more information about importing data from Avro, including examples, see Migrate from Avro.
Import into an existing table from a delimited data file
```sql
> IMPORT INTO customers
    DELIMITED DATA (
      's3://{BUCKET NAME}/{customers.csv}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    )
    WITH
      fields_terminated_by = '|',
      fields_enclosed_by = '"',
      fields_escaped_by = '\';
```

Import into a new table from a CSV file
To import into a new table, use `CREATE TABLE` followed by `IMPORT INTO`.
Note:
As of v21.2, `IMPORT TABLE` will be deprecated; therefore, we recommend using the following example to import data into a new table.
First, create the new table with the necessary columns and data types:
```sql
CREATE TABLE users (
    id UUID PRIMARY KEY,
    city STRING,
    name STRING,
    address STRING,
    credit_card STRING
);
```

Next, use `IMPORT INTO` to import the data into the new table:

```sql
IMPORT INTO users (id, city, name, address, credit_card)
    CSV DATA (
      'azure://{CONTAINER NAME}/{customers.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}'
    );
```

Import into an existing table from a CSV file
```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      'azure://{CONTAINER NAME}/{customers.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}'
    );
```

Note:
The column order in your `IMPORT` statement must match the column order in the CSV being imported, regardless of the order in the existing table's schema.
Import into an existing table from multiple CSV files

```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      'azure://{CONTAINER NAME}/{customers.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}',
      'azure://{CONTAINER NAME}/{customers2.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}',
      'azure://{CONTAINER NAME}/{customers3.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}',
      'azure://{CONTAINER NAME}/{customers4.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}',
      'azure://{CONTAINER NAME}/{customers5.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}'
    );
```

Import into an existing table from an Avro file
Avro OCF data, JSON records, or binary records can be imported. The following are examples of importing Avro OCF data.
To specify the table schema in-line:
```sql
> IMPORT INTO customers
    AVRO DATA (
      'azure://{CONTAINER NAME}/{customers.avro}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}'
    );
```

For more information about importing data from Avro, including examples, see Migrate from Avro.
Import into an existing table from a delimited data file
```sql
> IMPORT INTO customers
    DELIMITED DATA (
      'azure://{CONTAINER NAME}/{customers.csv}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={ENCODED KEY}'
    )
    WITH
      fields_terminated_by = '|',
      fields_enclosed_by = '"',
      fields_escaped_by = '\';
```

Note:
The examples in this section use the `AUTH=specified` parameter, which will be the default behavior in v21.2 and beyond for connecting to Google Cloud Storage. For more detail on how to pass your Google Cloud Storage credentials with this parameter, or how to use implicit authentication, read Use Cloud Storage for Bulk Operations — Authentication.
Import into a new table from a CSV file
To import into a new table, use `CREATE TABLE` followed by `IMPORT INTO`.
Note:
As of v21.2, `IMPORT TABLE` will be deprecated; therefore, we recommend using the following example to import data into a new table.
First, create the new table with the necessary columns and data types:
```sql
CREATE TABLE users (
    id UUID PRIMARY KEY,
    city STRING,
    name STRING,
    address STRING,
    credit_card STRING
);
```

Next, use `IMPORT INTO` to import the data into the new table:

```sql
IMPORT INTO users (id, city, name, address, credit_card)
    CSV DATA (
      'gs://{BUCKET NAME}/{customers.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}'
    );
```

Import into an existing table from a CSV file
```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      'gs://{BUCKET NAME}/{customers.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}'
    );
```

Note:
The column order in your `IMPORT` statement must match the column order in the CSV being imported, regardless of the order in the existing table's schema.
Import into an existing table from multiple CSV files
```sql
> IMPORT INTO customers (id, name)
    CSV DATA (
      'gs://{BUCKET NAME}/{customers.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}',
      'gs://{BUCKET NAME}/{customers2.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}',
      'gs://{BUCKET NAME}/{customers3.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}',
      'gs://{BUCKET NAME}/{customers4.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}'
    );
```

Import into an existing table from an Avro file
Avro OCF data, JSON records, or binary records can be imported. The following are examples of importing Avro OCF data.
To specify the table schema in-line:
```sql
> IMPORT INTO customers
    AVRO DATA (
      'gs://{BUCKET NAME}/{customers.avro}?AUTH=specified&CREDENTIALS={ENCODED KEY}'
    );
```

For more information about importing data from Avro, including examples, see Migrate from Avro.
Import into an existing table from a delimited data file
```sql
> IMPORT INTO customers
    DELIMITED DATA (
      'gs://{BUCKET NAME}/{customers.csv}?AUTH=specified&CREDENTIALS={ENCODED KEY}'
    )
    WITH
      fields_terminated_by = '|',
      fields_enclosed_by = '"',
      fields_escaped_by = '\';
```

Known limitations
- You cannot import into a table with partial indexes.
- While importing into an existing table, the table is taken offline.
- After importing into an existing table, constraints will be un-validated and need to be re-validated.
- Imported rows must not conflict with existing rows in the table or any unique secondary indexes.
- `IMPORT INTO` works for only a single existing table.
- `IMPORT INTO` can sometimes fail with a "context canceled" error, or can restart itself many times without ever finishing. If this is happening, it is likely due to a high amount of disk contention. This can be mitigated by setting the `kv.bulk_io_write.max_rate` cluster setting to a value below your max disk write speed. For example, to set it to 10MB/s, execute:

  ```sql
  > SET CLUSTER SETTING kv.bulk_io_write.max_rate = '10MB';
  ```
See also
- IMPORT
- Migration Overview
- Use Cloud Storage for Bulk Operations
- Import Performance Best Practices