SQL commands

Latest recommended article published on 2025-05-19 14:06:07

Licensed under CC 4.0 BY-SA


1. Select certain columns from a table where one column does not contain a specific string

select col1, col2 from table_name where col2 not like '%specific_string%'
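Note that not like '%...%' excludes every value that contains the string, which is broader than "not equal". The sketch below illustrates the difference against the <> operator. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name demo and its sample rows are assumptions made up for the illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, source_table TEXT)")
conn.executemany(
    "INSERT INTO demo (source_table) VALUES (?)",
    [("medicine_order",), ("medicine_order_2024",), ("patient_info",)],
)

# NOT LIKE '%...%' drops every row whose value contains the substring.
not_like_rows = conn.execute(
    "SELECT source_table FROM demo "
    "WHERE source_table NOT LIKE '%medicine_order%' ORDER BY id"
).fetchall()

# <> drops only the rows whose value equals the string exactly.
not_equal_rows = conn.execute(
    "SELECT source_table FROM demo "
    "WHERE source_table <> 'medicine_order' ORDER BY id"
).fetchall()

print(not_like_rows)
print(not_equal_rows)
```

If the goal really is "not equal to this exact string", <> is likely the better fit; not like with wildcards suits "does not contain".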

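2. Count the rows in a table

Any single column will do as the argument. Note that count(column_name) skips rows where that column is NULL, while count(*) counts every row.

select count(column_name) from table_name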
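Which column is counted matters when the column can hold NULL. A minimal sketch of the difference, again using sqlite3 as a stand-in for MySQL; the people table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, age INTEGER)")
# Three rows, one of which has no age recorded.
conn.executemany("INSERT INTO people (age) VALUES (?)", [(30,), (None,), (25,)])

# COUNT(*) counts rows; COUNT(age) counts only rows where age is not NULL.
count_all = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
count_age = conn.execute("SELECT COUNT(age) FROM people").fetchone()[0]

print(count_all, count_age)
```

Counting a NOT NULL column such as the primary key gives the same number as count(*).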

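3. limit specifies how many rows to return
Append limit A,B to the end of the SQL statement: it skips the first A matching rows (A is a zero-based offset) and returns the next B rows. If fewer than B rows remain, only the remaining rows are returned.

select col1, col2 from table_name where col2 not like '%specific_string%' limit A,B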
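The offset and count semantics of limit A,B can be checked on a tiny table. This is a sketch using sqlite3, which accepts the same limit A,B form, as a stand-in for MySQL; the items table and the page helper are hypothetical names.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 11)])  # ids 1..10


def page(offset, count):
    """Return the ids selected by 'LIMIT offset, count' in id order."""
    rows = conn.execute("SELECT id FROM items ORDER BY id LIMIT ?, ?", (offset, count))
    return [r[0] for r in rows]


print(page(0, 3))  # first three rows
print(page(3, 3))  # skips three rows, returns the next three
print(page(8, 5))  # only two rows remain after the offset
```

An order by is included because without one the database is free to return rows in any order, which makes paging results unpredictable.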

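Note: limit gets slower as the offset grows. select * from table_name limit 0,10 scans 10 matching rows and returns them, but limit 24355,20 is slow to read: MySQL scans 24355 + 20 = 24375 matching rows, throws away the first 24355, and returns only the last 20.
Solution:
Use a covering index by selecting only the id column, so the offset is walked in the index instead of in full rows:

select id from table_name order by id limit 24355,10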

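To fetch all columns: use a subquery with id >=. Both levels are ordered by id, because the id >= comparison only picks the right rows when the offset is counted in id order.

select * from table_name
where id >= (select id from table_name order by id limit 24355,1) order by id limit 20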

Join: page through the ids first, then join back to fetch the full rows

select a.* from table_name a
join (select id from table_name order by id limit 24355,10) b on a.id = b.id
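The sketch below checks that the id >= subquery form and the join form return the same rows as a plain limit with an offset. It uses sqlite3 as a stand-in for MySQL on a hypothetical 100-row articles table, so it demonstrates equivalence of the results only, not the speed difference, which depends on MySQL's optimizer and on table size.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(i, "title {}".format(i)) for i in range(1, 101)],
)

# Plain offset pagination: skip 50 rows, return 5.
plain = conn.execute("SELECT * FROM articles ORDER BY id LIMIT 50, 5").fetchall()

# Subquery form: find the first id of the page, then read forward from it.
sub = conn.execute(
    "SELECT * FROM articles "
    "WHERE id >= (SELECT id FROM articles ORDER BY id LIMIT 50, 1) "
    "ORDER BY id LIMIT 5"
).fetchall()

# Join form: page through ids only, then join back for the full rows.
joined = conn.execute(
    "SELECT a.* FROM articles a "
    "JOIN (SELECT id FROM articles ORDER BY id LIMIT 50, 5) b ON a.id = b.id "
    "ORDER BY a.id"
).fetchall()

print([row[0] for row in plain])
```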

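Subquery vs. join performance:
① Running a subquery may require MySQL to create a temporary table and drop it once the query finishes, so the extra create-and-destroy step can slow the query down.
② A join between base tables needs no temporary table. Filter whenever you can before joining, and join only the rows that survive the filter.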
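4. Deduplicate on one column while selecting multiple columns
Table structure: id is unique and the age column contains duplicates. The goal is to deduplicate on age and still return all columns.
Deduplicate on age and select every column, keeping the row with the largest id for each age:

select * from table1 where id in (select max(id) from table1 group by age)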
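A small worked example of this query, using sqlite3 as a stand-in for MySQL. The name column and the sample rows are assumptions added for the illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO table1 (id, name, age) VALUES (?, ?, ?)",
    [(1, "a", 20), (2, "b", 30), (3, "c", 20), (4, "d", 30), (5, "e", 40)],
)

# For each age, keep only the row with the largest id.
deduped = conn.execute(
    "SELECT * FROM table1 "
    "WHERE id IN (SELECT MAX(id) FROM table1 GROUP BY age) "
    "ORDER BY id"
).fetchall()

print(deduped)
```

Ages 20 and 30 each appear twice in the input, so rows 1 and 2 are dropped in favor of rows 3 and 4. Swapping max(id) for min(id) would keep the earliest row per age instead.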

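5. Reading data from MySQL (in batches)
Foreword
I needed to pull data out of a very large table and deduplicate it on one column. Of the two methods below, I ended up using the first. Reading rows one at a time, storing all of them, and deduplicating at the end would make deduplication too expensive, so with the first method each batch of 1000 rows is deduplicated as soon as it is read.
① Using limit: connect to the database, run the query, read the result in batches with limit, and write the result to a txt file: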

import pymysql


class Sql_df(object):
    def __init__(self, input_db):
        self.host = '10.0.2.34'
        self.port = 3306
        self.username = 'root'
        self.password = 'root'
        self.input_db = input_db  # name of the database to connect to
        self.conn = pymysql.connect(host=self.host, port=self.port, user=self.username,
                                    password=self.password, database=self.input_db, charset='utf8')

    def sql_input_batch(self, sql_state, nums_sample, batch_size):
        seen = set()  # values of column 2 already kept; a set makes the duplicate check O(1)
        result = []
        # Round up so the final partial batch is not skipped
        batches = (nums_sample + batch_size - 1) // batch_size
        cur = self.conn.cursor()
        for batch in range(batches):
            cur.execute(sql_state + ' limit ' + str(batch * batch_size) + ',' + str(batch_size))
            tmp_list = list(cur.fetchall())
            # Deduplicate and filter on special fields
            for row in tmp_list:
                # The original filter helper is not shown, so every row passes here.
                # Replace with a custom check, e.g. tag = self.up_select(row[2], row[3])
                tag = True
                if tag and row[2] not in seen:
                    seen.add(row[2])
                    result.append(tuple(row))
            print(len(tmp_list))
        cur.close()
        print(len(result))
        # Write the result to a txt file, one tab-separated row per line
        with open("data.txt", "w", encoding="utf-8") as f:
            for row in result:
                f.write("{}\t{}\t{}\t{}\n".format(row[0], row[1], row[2], row[3]))


if __name__ == '__main__':
    data_input = Sql_df('table1')  # database name
    # Query; order by id keeps the batches stable so limit pages do not overlap
    data_input.sql_input_batch("select id,source_table,source_column,column_data from table1 "
                               "where source_table not like '%medicine_order%' order by id",
                               2902940, 1000)
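One detail worth checking in batched reads is the last batch: when the row count is not a multiple of the batch size, flooring the division leaves the trailing rows unread. A minimal sketch of computing the offsets so that a final partial batch is included; batch_ranges and limit_clause are hypothetical helper names, not part of pymysql.

```python
def batch_ranges(total_rows, batch_size):
    """Yield (offset, size) pairs covering total_rows rows, including a final partial batch."""
    for offset in range(0, total_rows, batch_size):
        yield offset, min(batch_size, total_rows - offset)


def limit_clause(offset, size):
    """Build the ' limit offset,size' suffix that is appended to the base query."""
    return " limit {},{}".format(offset, size)


small = list(batch_ranges(2500, 1000))
print(small)

# The row count and batch size used in the article's call.
ranges = list(batch_ranges(2902940, 1000))
print(len(ranges), ranges[-1], limit_clause(*ranges[-1]))
```

With 2902940 rows and a batch size of 1000 this gives 2903 batches, the last one holding 940 rows.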

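② Using SSCursor (streaming cursor)
This addresses the high memory usage that occurs when Python queries a large amount of data through pymysql. SSCursor is an unbuffered cursor: it does not load the whole result set into memory. Instead it fetches rows from the server as they are requested and hands them back one at a time.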

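import pymysql

# Connection values are placeholders; recent PyMySQL versions require keyword arguments here
dbmy = pymysql.connect(host="ip", user="user", password="pass", database="database",
                       cursorclass=pymysql.cursors.SSCursor)
cursor = dbmy.cursor()
sql = "select * from table_name"
cursor.execute(sql)  # with an unbuffered cursor the total row count is not known at this point
result = cursor.fetchone()
while result is not None:
    # process the current row here before fetching the next one
    result = cursor.fetchone()
cursor.close()
dbmy.close()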
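Deduplication can also be done while streaming, so that only the distinct key values are held in memory instead of every row. The sketch below wraps the fetchone loop in a generator. It runs against sqlite3 as a stand-in for a pymysql SSCursor (both expose fetchone), and stream_unique, the table t, and its rows are assumptions for the illustration.

```python
import sqlite3


def stream_unique(cursor, key_index):
    """Yield rows one at a time, skipping rows whose key column was already seen."""
    seen = set()
    row = cursor.fetchone()
    while row is not None:
        if row[key_index] not in seen:
            seen.add(row[key_index])
            yield row
        row = cursor.fetchone()


# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, column_data TEXT)")
conn.executemany(
    "INSERT INTO t (column_data) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",)],
)

cur = conn.cursor()
cur.execute("SELECT id, column_data FROM t ORDER BY id")
unique_rows = list(stream_unique(cur, 1))
print(unique_rows)
cur.close()
conn.close()
```

The set of seen keys still grows with the number of distinct values, so this helps most when duplicates are common.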