• ADADADADAD

    MYSQL INNODB主键使用varchar和int的区别是什么[ mysql数据库 ]

    mysql数据库 时间:2024-11-28 13:01:31

    作者:文/会员上传

    简介:

    今天同事讨论关于主键使用varchar和int的区别。我现在总结的3个问题:1、tablespace中空间浪费 当然我们知道使用varchar可能会导致辅助索引比较大,因为用到varchar可能存储的

    以下为本文的正文内容,内容仅供参考!本站为公益性网站,复制本文以及下载DOC文档全部免费。



    今天同事讨论关于主键使用varchar和int的区别。

    我现在总结的3个问题:
    1、tablespace中空间浪费
    当然我们知道使用varchar可能会导致辅助索引比较大,因为用到varchar可能存储的字符较多,同时
    在行头也存在一个可变字段字符区域(1-2)字节
    而辅助索引叶子结点毕竟都存储了主键值,这样至少会多varchar数据字节数量+1(或者2) 字节- 4(int)字节空间。
    如果辅助索引比较多空间浪费是可想而知的。
    2、辅助索引B+树扫描性能
    由于辅助索引B+树的空间要求更大,虽然在B+树层次一般都是3层-4层,索引单值定位I/O消耗并不明显,如果涉及到
    范围查询(比如PAGE_CUR_G),需要访问的块就更多,同时比如例如辅助索引的using index,需要访问的块自然
    更多
    3、比较更加复杂
    innodb 在进行元组比较的时候,不管是DML,select都会涉及到元组的比较,同时回表的时候也涉及
    到比较操作。而varchar类型的比较比int类型更为复杂一些。
    那么我们就来分析第三个问题,第一个问题和第二个问题是显而易见的。
    我这里数据库字符集为latin1\latin1_swedish_ci

    其实在innodb底层进行比较的时候都调用cmp_data这个函数
    在innodb中有自己的定义的数据类型如下:

    点击(此处)折叠或打开

      /*-------------------------------------------*/

      /* The 'MAIN TYPE' of a column */

      #define DATA_MISSING0/* missing column */

      #defineDATA_VARCHAR1/* character varying of the

      latin1_swedish_ci charset-collation; note

      that the MySQL format for this, DATA_BINARY,

      DATA_VARMYSQL, is also affected by whether the

      'precise type' contains

      DATA_MYSQL_TRUE_VARCHAR */

      #define DATA_CHAR2/* fixed length character of the

      latin1_swedish_ci charset-collation */

      #define DATA_FIXBINARY3/* binary string of fixed length */

      #define DATA_BINARY4/* binary string */

      #define DATA_BLOB5/* binary large object, or a TEXT type;

      if prtype & DATA_BINARY_TYPE == 0, then this is

      actually a TEXT column (or a BLOB created

      with < 4.0.14; since column prefix indexes

      came only in 4.0.14, the missing flag in BLOBs

      created before that does not cause any harm) */

      #defineDATA_INT6/* integer: can be any size 1 - 8 bytes */

      #defineDATA_SYS_CHILD7/* address of the child page in node pointer */

      #defineDATA_SYS8/* system column */


    我们熟悉的int类型属于DATA_INT而varchar属于DATA_VARCHAR,rowid属于DATA_SYS
    在函数cmp_data根据各种类型的不同进行了不同比较的方式,这里就将int和varchar
    判断的方式进行说明:
    1、innodb int类型比较
    实际上是在cmp_data中进行了大概的方式如下

    点击(此处)折叠或打开

      if (len) {

      #if defined __i386__ || defined __x86_64__ || defined _M_IX86 || defined _M_X64

      /* Compare the first bytes with a loop to avoid the call

      overhead of memcmp(). On x86 and x86-64, the GCC built-in

      (repz cmpsb) seems to be very slow, so we will be calling the

      libc version. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052

      tracks the slowness of the GCC built-in memcmp().

      We compare up to the first 4..7 bytes with the loop.

      The (len & 3) is used for "normalizing" or

      "quantizing" the len parameter for the memcmp() call,

      in case the whole prefix is equal. On x86 and x86-64,

      the GNU libc memcmp() of equal strings is faster with

      len=4 than with len=3.

      On other architectures than the IA32 or AMD64, there could

      be a built-in memcmp() that is faster than the loop.

      We only use the loop where we know that it can improve

      the performance. */

      for (ulint i = 4 + (len & 3); i > 0; i--) {

      cmp = int(*data1++) - int(*data2++);

      if (cmp) {

      return(cmp);

      }

      if (!--len) {

      break;

      }

      }

      my_strnncollsp_simple

      if (len) {

      #endif /* IA32 or AMD64 */

      cmp = memcmp(data1, data2, len);

      if (cmp) {

      return(cmp);

      }

      data1 += len;

      data2 += len;

      #if defined __i386__ || defined __x86_64__ || defined _M_IX86 || defined _M_X64

      }

      #endif /* IA32 or AMD64 */

      }

      cmp = (int) (len1 - len2);

      if (!cmp || pad == ULINT_UNDEFINED) {

      return(cmp);

      }

    可以看到整个方式比较简洁,对于我们常用的x86_64模型并没有直接使用memcpy进行而是
    进行了优化在注释中也有说明,才出现了for (ulint i = 4 + (len & 3); i > 0; i--)
    部分,如果是IA32 or AMD64则直接使用memcpy进行比较。感兴趣的可以仔细阅读一下

    2、innodb varchar类型比较
    实际上这个比较会通过cmp_data->cmp_whole_field->my_strnncollsp_simple调用最终调用
    my_strnncollsp_simple完成,而比如order by 会调用my_strnxfrm_simple他们都在一个
    文件中。
    下面是整个my_strnncollsp_simple函数

    点击(此处)折叠或打开

      /*

      Compare strings, discarding end space

      SYNOPSIS

      my_strnncollsp_simple()

      cscharacter set handler

      aFirst string to compare

      a_lengthLength of 'a'

      bSecond string to compare

      b_lengthLength of 'b'

      diff_if_only_endspace_difference

      Set to 1 if the strings should be regarded as different

      if they only difference in end space

      IMPLEMENTATION

      If one string is shorter as the other, then we space extend the other

      so that the strings have equal length.

      This will ensure that the following things hold:

      "a" == "a "

      "a\0" < "a"

      "a\0" < "a "

      RETURN

      < 0a < b

      = 0a == b

      > 0a > b

      */

      int my_strnncollsp_simple(const CHARSET_INFO *cs, const uchar *a,

      size_t a_length, const uchar *b, size_t b_length,

      my_bool diff_if_only_endspace_difference)

      {

      const uchar *map= cs->sort_order, *end;

      size_t length;

      int res;

      #ifndef VARCHAR_WITH_DIFF_ENDSPACE_ARE_DIFFERENT_FOR_UNIQUE

      diff_if_only_endspace_difference= 0;

      #endif

      end= a + (length= MY_MIN(a_length, b_length));

      while (a < end)

      {

      if (map[*a++] != map[*b++])

      return ((int) map[a[-1]] - (int) map[b[-1]]);

      }

      res= 0;

      if (a_length != b_length)

      {

      int swap= 1;

      if (diff_if_only_endspace_difference)

      res= 1; /* Assume 'a' is bigger */

      /*

      Check the next not space character of the longer key. If it's < ' ',

      then it's smaller than the other key.

      */

      if (a_length < b_length)

      {

      /* put shorter key in s */

      a_length= b_length;

      a= b;

      swap= -1; /* swap sign of result */

      res= -res;

      }

      for (end= a + a_length-length; a < end ; a++)

      {

      if (map[*a] != map[' '])

      return (map[*a] < map[' ']) ? -swap : swap;

      }

      }

      return res;

      }

    其中*map= cs->sort_order比较关键这是内存中已经存储好的字符集的顺序,
    循环进行
    map[*a++] != map[*b++]
    *a++和*b++ 会得到的字符集编码,然后在整个排序好的字符数组中找,
    则得到了实际字符集编码进行比较,不管是比较的复杂度还是需要比较的
    长度 varchar很可能都远远大于int类型,下面是打印cs->sort_order这片
    内存区域前128字节得到的结果,
    (gdb) x/128bx 0x258b000
    0x258b000 : 0x000x010x020x030x040x050x060x07
    0x258b008 :0x080x090x0a0x0b0x0c0x0d0x0e0x0f
    0x258b010 :0x100x110x120x130x140x150x160x17
    0x258b018 :0x180x190x1a0x1b0x1c0x1d0x1e0x1f
    0x258b020 :0x200x210x220x230x240x250x260x27
    0x258b028 :0x280x290x2a0x2b0x2c0x2d0x2e0x2f
    0x258b030 :0x300x310x320x330x340x350x360x37
    0x258b038 :0x380x390x3a0x3b0x3c0x3d0x3e0x3f
    0x258b040 :0x400x410x420x430x440x450x460x47
    0x258b048 :0x480x490x4a0x4b0x4c0x4d0x4e0x4f
    0x258b050 :0x500x510x520x530x540x550x560x57
    0x258b058 :0x580x590x5a0x5b0x5c0x5d0x5e0x5f
    0x258b060 :0x600x410x420x430x440x450x460x47
    0x258b068 : 0x480x490x4a0x4b0x4c0x4d0x4e0x4f
    0x258b070 : 0x500x510x520x530x540x550x560x57
    0x258b078 : 0x580x590x5a0x7b0x7c0x7d0x7e0x7f
    而从内存的地址0x258b000我们也能看到他确实是存在于堆内存空间中,它是一片堆内存区域。

    下面是varchar比较的调用栈帧以备后用

    #0 my_strnncollsp_simple (cs=0x2d4b9c0, a=0x7fff57a71f93 "gaopeng", a_length=7, b=0x7fffbd7e807f "gaopeng", b_length=7, diff_if_only_endspace_difference=0 '\000')

      at /root/mysql5.7.14/percona-server-5.7.14-7/strings/ctype-simple.c:165

      #1 0x0000000001ab8ec2 in cmp_whole_field (mtype=1, prtype=524303, a=0x7fff57a71f93 "gaopeng", a_length=7, b=0x7fffbd7e807f "gaopeng", b_length=7)

      at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/rem/rem0cmp.cc:374

      #2 0x0000000001aba827 in cmp_data (mtype=1, prtype=524303, data1=0x7fff57a71f93 "gaopeng", len1=7, data2=0x7fffbd7e807f "gaopeng", len2=7)

      at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/rem/rem0cmp.cc:468

      #3 0x0000000001ab9a05 in cmp_dtuple_rec_with_match_bytes (dtuple=0x7fff48ed3280, rec=0x7fffbd7e807f "gaopeng", index=0x7fff48ec78a0, offsets=0x7fff57a6bc50,

      matched_fields=0x7fff57a6bf80, matched_bytes=0x7fff57a6bf78) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/rem/rem0cmp.cc:880

      #4 0x0000000001a87fe2 in page_cur_search_with_match_bytes (block=0x7fffbcceafc0, index=0x7fff48ec78a0, tuple=0x7fff48ed3280, mode=PAGE_CUR_GE,

      iup_matched_fields=0x7fff57a6cdf8, iup_matched_bytes=0x7fff57a6cdf0, ilow_matched_fields=0x7fff57a6cde8, ilow_matched_bytes=0x7fff57a6cde0, cursor=0x7fff57a713f8)

      at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/page/page0cur.cc:850

      #5 0x0000000001c17a3e in btr_cur_search_to_nth_level (index=0x7fff48ec78a0, level=0, tuple=0x7fff48ed3280, mode=PAGE_CUR_GE, latch_mode=1, cursor=0x7fff57a713f0,

      has_search_latch=0, file=0x2336938 "/root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc", line=5744, mtr=0x7fff57a70ee0)

      at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc:1478

      #6 0x0000000001c222bf in btr_estimate_n_rows_in_range_low (index=0x7fff48ec78a0, tuple1=0x7fff48ed3280, mode1=PAGE_CUR_GE, tuple2=0x7fff48ed32e0, mode2=PAGE_CUR_G,

      nth_attempt=1) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc:5744

      #7 0x0000000001c22a09 in btr_estimate_n_rows_in_range (index=0x7fff48ec78a0, tuple1=0x7fff48ed3280, mode1=PAGE_CUR_GE, tuple2=0x7fff48ed32e0, mode2=PAGE_CUR_G)

      at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/btr/btr0cur.cc:6044

      #8 0x00000000019b3e0e in ha_innobase::records_in_range (this=0x7fff48e7e3b0, keynr=1, min_key=0x7fff57a71680, max_key=0x7fff57a716a0)

      at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:13938

      #9 0x0000000000f6ed5b in handler::multi_range_read_info_const (this=0x7fff48e7e3b0, keyno=1, seq=0x7fff57a71b90, seq_init_param=0x7fff57a71850, n_ranges_arg=0,

      bufsz=0x7fff57a71780, flags=0x7fff57a71784, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:6440

      #10 0x0000000000f70662 in DsMrr_impl::dsmrr_info_const (this=0x7fff48e7e820, keyno=1, seq=0x7fff57a71b90, seq_init_param=0x7fff57a71850, n_ranges=0,

      bufsz=0x7fff57a71d70, flags=0x7fff57a71d74, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/handler.cc:7112

      #11 0x00000000019be22f in ha_innobase::multi_range_read_info_const (this=0x7fff48e7e3b0, keyno=1, seq=0x7fff57a71b90, seq_init_param=0x7fff57a71850, n_ranges=0,

      bufsz=0x7fff57a71d70, flags=0x7fff57a71d74, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/innobase/handler/ha_innodb.cc:21351

      #12 0x000000000178c9e4 in check_quick_select (param=0x7fff57a71e30, idx=0, index_only=false, tree=0x7fff48e700e0, update_tbl_stats=true, mrr_flags=0x7fff57a71d74,

      bufsize=0x7fff57a71d70, cost=0x7fff57a71d10) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:10030

      #13 0x0000000001783305 in get_key_scans_params (param=0x7fff57a71e30, tree=0x7fff48e70058, index_read_must_be_used=false, update_tbl_stats=true,

      cost_est=0x7fff57a74190) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:5812

      #14 0x000000000177ce43 in test_quick_select (thd=0x7fff4801f4d0, keys_to_use=..., prev_tables=0, limit=18446744073709551615, force_quick_range=false,

      interesting_order=st_order::ORDER_NOT_RELEVANT, tab=0x7fff48eacf20, cond=0x7fff48eacd50, needed_reg=0x7fff48eacf60, quick=0x7fff57a744c8)

      at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:3066

      #15 0x000000000158b9bc in get_quick_record_count (thd=0x7fff4801f4d0, tab=0x7fff48eacf20, limit=18446744073709551615)

      at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:5942

      #16 0x000000000158b073 in JOIN::estimate_rowcount (this=0x7fff48eac980) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:5689

      #17 0x00000000015893b5 in JOIN::make_join_plan (this=0x7fff48eac980) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:5046

      #18 0x000000000157d9b7 in JOIN::optimize (this=0x7fff48eac980) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_optimizer.cc:387

      #19 0x00000000015fab71 in st_select_lex::optimize (this=0x7fff48aa45c0, thd=0x7fff4801f4d0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:1009

      #20 0x00000000015f9284 in handle_query (thd=0x7fff4801f4d0, lex=0x7fff48021ab0, result=0x7fff48aa5dc8, added_options=0, removed_options=0)

      at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:164

      #21 0x00000000015ac159 in execute_sqlcom_select (thd=0x7fff4801f4d0, all_tables=0x7fff48aa54b8) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5391

      #22 0x00000000015a4774 in mysql_execute_command (thd=0x7fff4801f4d0, first_level=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:2889

      #23 0x00000000015ad12a in mysql_parse (thd=0x7fff4801f4d0, parser_state=0x7fff57a76600) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5836

      #24 0x00000000015a0fe9 in dispatch_command (thd=0x7fff4801f4d0, com_data=0x7fff57a76d70, command=COM_QUERY)

      at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1447

      #25 0x000000000159fe1a in do_command (thd=0x7fff4801f4d0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1010

      #26 0x00000000016e1d6c in handle_connection (arg=0x6320740) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/conn_handler/connection_handler_per_thread.cc:312

      ---Type <return> to continue, or q <return> to quit---

      #27 0x0000000001d723f4 in pfs_spawn_thread (arg=0x6320530) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/perfschema/pfs.cc:2188

      #28 0x0000003ca62079d1 in start_thread () from /lib64/libpthread.so.0

      #29 0x0000003ca5ee8b6d in clone () from /lib64/libc.so.6

    MYSQL INNODB主键使用varchar和int的区别是什么.docx

    将本文的Word文档下载到电脑

    推荐度:

    下载
    热门标签: innodbmysql