Use INT or GUID for identification field in database?

There are some debate over internet for what to use for identification field in databases – int or guid. As in most cases here isn’t one universal truth. Here are few insights on subject.

  • GUIDs are problematic as clustered keys because of the high randomness.
  • The GUID is going to take up more space and be slower than an int.
  • The advantage of GUIDs is that they are (pretty much) globally unique.
  • When your table is large, and you use selects with JOINS, especially with derived tables, using GUIDs can significally decrease performance.
  • An integer is the native word size for the hardware. It’s about as fast and simple and easy on the computer as a data type gets.
  • When considering possibly using a GUID, know that they make for terrible primary keys. Relational databases in general (I can’t speak for all, but MS SQL is a good example) don’t index GUIDs well. There are hacks out there to try to make more index-friendly GUIDs, take them or leave them. But in general a GUID should be avoided as a PK for performance reasons.
  • GUID allows easy merging of records from different databases.
  • GUID allows easy distribution of databases across multiple servers.
  • GUID can generate IDs anywhere, instead of having to roundtrip to the database
  • Most replication scenarios require GUID columns anyway.
  • GUID whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you’re not careful.
  • GUID is cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}’).

Further reading 

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *