MySQL - Speed of INSERT Queries?
The time to insert a record consists approximately of:
Sending query to server: (2)
Parsing query: (2)
Inserting record: (1 x size of record)
Inserting indexes: (1 x number of indexes)
where the numbers are somewhat proportional to the
overall time. This does not take into consideration the
initial overhead to open tables (which is done once for
each concurrently running query).
The size of the table slows down the insertion of
indexes by N log N (B-trees).
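As a rough illustration, the proportions above can be combined into a toy cost model. The weights and the function itself are assumptions for illustration only, not MySQL internals:

```python
import math

def relative_insert_cost(record_size, num_indexes, table_rows):
    """Toy model of single-row insert cost using the rough
    proportions above; not a real MySQL measurement."""
    send = 2                      # sending the query to the server
    parse = 2                     # parsing the query
    record = 1 * record_size      # writing the record itself
    # each index insert descends a B-tree, roughly log N in table size
    indexes = num_indexes * math.log(max(table_rows, 2))
    return send + parse + record + indexes
```

The model captures the two qualitative points above: per-row cost grows with record size and index count, and index maintenance gets slower as the table grows.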
Some ways to speed up inserts:
If you are inserting many rows from the same client at
the same time, use multiple value lists INSERT
statements. This is much faster (many times in some
cases) than using separate INSERT statements.
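As a client-side sketch, such a multiple-value-list statement can be assembled from the rows before sending it. The table and column names are invented, values are assumed numeric, and real code should quote/escape values through its client library:

```python
def build_multi_insert(table, columns, rows):
    """Build one multiple-value-list INSERT from many rows.
    Assumes numeric values; real code must escape/quote properly."""
    cols = ", ".join(columns)
    values = ", ".join(
        "(" + ", ".join(str(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({cols}) VALUES {values};"

# One round trip and one parse instead of three separate INSERTs:
sql = build_multi_insert("a", ["id", "x"], [(1, 23), (2, 34), (4, 33)])
```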
If you are inserting a lot of rows from different
clients, you can get higher speed by using the INSERT
DELAYED statement.
Note that with MyISAM you can insert rows at the same
time SELECTs are running if there are no deleted rows in
the tables. When loading a table from a text file, use
LOAD DATA INFILE. This is usually 20 times faster than
using a lot of INSERT statements.
It is possible with some extra work to make LOAD DATA
INFILE run even faster when the table has many indexes.
Use the following procedure:
Optionally create the table with CREATE TABLE. For
example, using mysql or Perl-DBI.
Execute a FLUSH TABLES statement or the shell command
mysqladmin flush-tables.
Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name.
This will remove all usage of all indexes from the
table.
Insert data into the table with LOAD DATA INFILE. This
will not update any indexes and will therefore be very
fast.
If you are only going to read the table in the future,
run myisampack on it to make it smaller.
Re-create the indexes with myisamchk -r -q /path/to/db/tbl_name.
This will create the index tree in memory before writing
it to disk, which is much faster because it avoids lots
of disk seeks. The resulting index tree is also
perfectly balanced. Execute a FLUSH TABLES statement or
the shell command mysqladmin flush-tables.
This procedure will be built into LOAD DATA INFILE in
some future version of MySQL.
You can speed up insertions by locking your tables:
mysql> LOCK TABLES a WRITE;
mysql> INSERT INTO a VALUES (1,23),(2,34),(4,33);
mysql> INSERT INTO a VALUES (8,26),(6,29);
mysql> UNLOCK TABLES;
The main speed difference is that the index buffer is
flushed to disk only once, after all INSERT statements
have completed. Normally there would be as many index
buffer flushes as there are different INSERT statements.
Locking is not needed if you can insert all rows with a
single statement. Locking will also lower the total time
of multi-connection tests, but the maximum wait time for
some threads will go up (because they wait for locks).
thread 1 does 1000 inserts
thread 2, 3, and 4 does 1 insert
thread 5 does 1000 inserts
If you don't use locking, 2, 3, and 4 will finish before
1 and 5. If you use locking, 2, 3, and 4 probably will
not finish before 1 or 5, but the total time should be
about 40% faster. As INSERT, UPDATE, and DELETE
operations are very fast in MySQL, you will obtain
better overall performance by adding locks around
everything that does more than about 5 inserts or
updates in a row. If you do very many inserts in a row,
you could do a LOCK TABLES followed by an UNLOCK TABLES
once in a while (about every 1,000 rows) to allow other
threads access to the table. This would still result in
a nice performance gain. Of course, LOAD DATA INFILE is
much faster for loading data.
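The "lock, insert about 1,000 rows, unlock" pattern above can be sketched as a statement generator. The batch size, table name, and the helper itself are illustrative, and values are assumed numeric and unescaped:

```python
def batched_insert_statements(table, rows, batch_size=1000):
    """Emit LOCK ... WRITE / INSERT ... / UNLOCK TABLES in batches,
    mirroring the pattern described above.  Values are assumed
    numeric and are not escaped; names are illustrative."""
    stmts = []
    for i in range(0, len(rows), batch_size):
        stmts.append(f"LOCK TABLES {table} WRITE;")
        for row in rows[i:i + batch_size]:
            vals = ", ".join(str(v) for v in row)
            stmts.append(f"INSERT INTO {table} VALUES ({vals});")
        stmts.append("UNLOCK TABLES;")
    return stmts
```

Each batch pays for one index-buffer flush at UNLOCK TABLES, while the gaps between batches let other threads get at the table.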
To get some more speed for both LOAD DATA INFILE and
INSERT, enlarge the key buffer.
MySQL - Speed of UPDATE Queries?
Update queries are optimized like SELECT queries, with
the additional overhead of a write. The speed of the write
is dependent on the size of the data that is being
updated and the number of indexes that are updated.
Indexes that are not changed will not be updated.
Also, another way to get fast updates is to delay
updates and then do many updates in a row later. Doing
many updates in a row is much quicker than doing one at
a time if you lock the table.
Note that, with the dynamic record format, updating a record
to a longer total length may split the record. So if you
do this often, it is very important to run OPTIMIZE TABLE
sometimes.
MySQL - Speed of DELETE Queries?
If you want to delete all rows in the table, you should
use TRUNCATE TABLE table_name.
The time to delete a record is exactly proportional to
the number of indexes. To delete records more quickly,
you can increase the size of the index cache.
MySQL - Other Optimization Tips
Unsorted tips for faster systems:
Use persistent connections to the database to avoid the
connection overhead. If you can't use persistent
connections and you are doing a lot of new connections
to the database, you may want to change the value of the
thread_cache_size variable.
Always check that all your queries really use the
indexes you have created in the tables. In MySQL you can
do this with the EXPLAIN command.
Try to avoid complex SELECT queries on tables that are
updated a lot. This is to avoid problems with table
locking. The new MyISAM tables can insert rows into a
table without deleted rows at the same time another
thread is reading from it. If this is important to you,
you should consider methods where you don't have to
delete rows or run OPTIMIZE TABLE after you have deleted
a lot of rows.
Use ALTER TABLE ... ORDER BY expr1,expr2... if you
mostly retrieve rows in expr1,expr2.. order. By using
this option after big changes to the table, you may be
able to get higher performance.
In some cases it may make sense to introduce a column
that is 'hashed' based on information from other
columns. If this column is short and reasonably unique,
it may be much faster than a big index on many columns.
In MySQL it's very easy to use this extra column:
mysql> SELECT * FROM table_name
    ->     WHERE hash=MD5(CONCAT(col1,col2))
    ->     AND col1='constant' AND col2='constant';
For tables that change a lot, you should try to avoid
all VARCHAR or BLOB columns. You will get dynamic row
length as soon as you use a single VARCHAR or BLOB column.
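The hashed-column lookup above can also be sketched from the client side; here the table and column names are hypothetical and the MD5 digest is computed in the application rather than in SQL:

```python
import hashlib

def hash_lookup_query(col1_value, col2_value):
    """Narrow the search with a short hash column, then confirm
    with the exact column comparisons; names are illustrative
    and values are assumed to need no SQL escaping."""
    digest = hashlib.md5((col1_value + col2_value).encode()).hexdigest()
    return (
        f"SELECT * FROM table_name WHERE hash='{digest}' "
        f"AND col1='{col1_value}' AND col2='{col2_value}'"
    )
```

The short, reasonably unique hash column does the index work; the trailing exact comparisons weed out any hash collisions.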
It's not normally useful to split a table into different
tables just because the rows gets 'big'. To access a
row, the biggest performance hit is the disk seek to
find the first byte of the row. After finding the data
most new disks can read the whole row fast enough for
most applications. The only cases where it really
matters to split up a table is if it's a dynamic row
size table (see above) that you can change to a fixed
row size, or if you very often need to scan the table
and don't need most of the columns.
If you very often need to calculate things based on
information from a lot of rows (like counts of things),
it's probably much better to introduce a new table and
update the counter in real time. An update of type
UPDATE table SET count=count+1 WHERE index_column=constant
is very fast! This is really important when you use
databases like MySQL that only have table locking
(multiple readers / single writers). This will also give
better performance with most databases, as the row
locking manager in this case will have less to do.
If you need to collect statistics from big log tables,
use summary tables instead of scanning the whole table.
Maintaining the summaries should be much faster than
trying to do statistics 'live'. It's much faster to
regenerate new summary tables from the logs when things
change (depending on business decisions) than to have to
change the running application! If possible, one should
classify reports as 'live' or 'statistical', where data
needed for statistical reports is only generated based
on summary tables that are generated from the actual
data.
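A toy stand-in for regenerating a summary table from a log table might look like this; the row shape and grouping key are invented for the example:

```python
from collections import Counter

def summarize(log_rows, key):
    """Regenerate a summary table (as a list of dicts) from raw
    log rows, instead of scanning the log at report time; the
    row shape and grouping key are invented for the example."""
    counts = Counter(row[key] for row in log_rows)
    return [{"key": k, "count": n} for k, n in sorted(counts.items())]
```

In practice the result would be written to a real summary table, and statistical reports would query only that table.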
Take advantage of the fact that columns have default
values. Insert values explicitly only when the value to
be inserted differs from the default. This reduces the
parsing that MySQL needs to do and improves the insert
speed.
In some cases it's convenient to pack and store data
into a BLOB column. In this case, you have to add some
extra code in your application to pack/unpack things in
the blob, but this may save a lot of accesses at some
stage. This is practical when you have data that doesn't
conform to a static table structure.
Normally you should try to keep all data non-redundant
(what is called 3rd normal form in database theory), but
you should not be afraid of duplicating things or
creating summary tables if you need these to gain more
speed.
Stored procedures or UDF (user-defined functions) may be
a good way to get more performance. In this case you
should, however, always have a way to do this some other
(slower) way if you use some database that doesn't
support this. You can always gain something by caching
queries/answers in your application and trying to do
many inserts/updates at the same time. If your database
supports lock tables (like MySQL and Oracle), this
should help to ensure that the index cache is only
flushed once after all updates.
Use INSERT /*! DELAYED */ when you do not need to know
when your data is written. This speeds things up because
many records can be written with a single disk write.
Use INSERT /*! LOW_PRIORITY */ when you want your
selects to be more important.
Use SELECT /*! HIGH_PRIORITY */ to get selects that jump
the queue. That is, the select is done even if there is
somebody waiting to do a write.
Use the multi-line INSERT statement to store many rows
with one SQL command (many SQL servers support this).
Use LOAD DATA INFILE to load bigger amounts of data.
This is faster than normal inserts and will be even
faster when myisamchk is integrated in mysqld.
Use AUTO_INCREMENT columns to make unique values.
Use OPTIMIZE TABLE once in a while to avoid
fragmentation when using dynamic table format.
Use HEAP tables to get more speed when possible.
When using a normal Web server setup, images should be
stored as files. That is, store only a file reference in
the database. The main reason for this is that a normal
Web server is much better at caching files than database
contents. So it's much easier to get a fast system if
you are using files.
Use in memory tables for non-critical data that are
accessed often (like information about the last shown
banner for users that don't have cookies).
Columns with identical information in different tables
should be declared identical and have identical names.
Before Version 3.23 you got slow joins otherwise. Try to
keep the names simple (use name instead of customer_name
in the customer table). To make your names portable to
other SQL servers, you should keep them shorter than 18
characters.
If you need REALLY high speed, you should take a look at
the low-level interfaces for data storage that the
different SQL servers support! For example, by accessing
the MySQL MyISAM table handler directly, you could get a speed
increase of 2-5 times compared to using the SQL
interface. To be able to do this the data must be on the
same server as the application, and usually it should
only be accessed by one process (because external file
locking is really slow). One could eliminate the above
problems by introducing low-level MyISAM commands in the
MySQL server (this could be one easy way to get more
performance if needed). By carefully designing the
database interface, it should be quite easy to support
these types of optimization. In many cases it's faster to
access data from a database (using a live connection)
than accessing a text file, just because the database is
likely to be more compact than the text file (if you are
using numerical data), and this will involve fewer disk
accesses. You will also save code because you don't have
to parse your text files to find line and column
boundaries. You can also use replication to speed things
up.
Declaring a table with DELAY_KEY_WRITE=1 will make the
updating of indexes faster, as these are not logged to
disk until the file is closed. The downside is that you
should run myisamchk on these tables before you start
mysqld to ensure that they are okay if something killed
mysqld in the middle. As the key information can always
be generated from the data, you should not lose anything
by using DELAY_KEY_WRITE.