Re: Row Limit on tables

Lists: pgsql-general
From: "shey sewani" <pakix2000(at)hotmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Row Limit on tables
Date: 2002-05-31 15:25:48
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

Is there a limit on the number of rows you can have in a table?
Couple experts have predicted that 20Million or 200Million was the limit. I
would like to hear your expereinces.

My machine is a Redhat 7.1 x86 Box running Postgres7.1.2

Thank you

Sheheryar Sewani
Houston, TX

_________________________________________________________________
Join the worlds largest e-mail service with MSN Hotmail.
http://www.hotmail.com


From: Devrim GUNDUZ <devrim(at)oper(dot)metu(dot)edu(dot)tr>
To: shey sewani <pakix2000(at)hotmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-05-31 16:15:19
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general


Hi,

On Fri, 31 May 2002, shey sewani wrote:

>
>
> Is there a limit on the number of rows you can have in a table?
> Couple experts have predicted that 20Million or 200Million was the limit. I
> would like to hear your expereinces.
>
> My machine is a Redhat 7.1 x86 Box running Postgres7.1.2

I cannot remember the definition exactly; but as far as I remember, it was
with count() function; which returns 32-bit integer, and for tables having
more than 2 billion rows, count() has really no "meaning".

That's why numbger of rows in a table is only limited to your available
physical disk space or swap.

Best regards.
--

Devrim GUNDUZ

devrim(at)oper(dot)metu(dot)edu(dot)tr
devrim(dot)gunduz(at)linux(dot)org(dot)tr

Web : http://devrim.oper.metu.edu.tr
-------------------------------------


From: Curt Sampson <cjs(at)cynic(dot)net>
To: shey sewani <pakix2000(at)hotmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-05-31 17:22:59
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

On Fri, 31 May 2002, shey sewani wrote:

> Is there a limit on the number of rows you can have in a table?
> Couple experts have predicted that 20Million or 200Million was the limit. I
> would like to hear your expereinces.

Nowhere near that low, that's for sure. I'm playing with a 500
million row table right now. And this is just on a single cheap
IDE disk, not on real hardware that would let me work with properly
large databases....

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "shey sewani" <pakix2000(at)hotmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-05-31 17:39:37
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

"shey sewani" <pakix2000(at)hotmail(dot)com> writes:
> Is there a limit on the number of rows you can have in a table?

http://www.ca.postgresql.org/docs/faq-english.html#4.5

The smallest useful row size would probably be 40 bytes (2 int columns
plus 32 bytes overhead), so 16TB/40 = 400 billion rows is the most you
could possibly fit given the table size limit (unless you raise BLCKSZ
to 32K, which would give you another factor of 4).

In practice, "what will fit on your disk" is the limit.

> Couple experts have predicted that 20Million or 200Million was the
> limit.

These "experts" evidently have no clue what they are talking about.
Perhaps they neglected to read the thread running parallel to this
one that mentions throwaway tests on 500M-row tables...

I saw nearby a claim that count(*) would fail at 2G rows --- but
count(*) returns int8 in 7.2 and later.

regards, tom lane


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: shey sewani <pakix2000(at)hotmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Row Limit on tables
Date: 2002-05-31 18:25:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

On Fri, 31 May 2002, Tom Lane wrote:

> In practice, "what will fit on your disk" is the limit.

Actually, not even. Or so I think for most cases.

Playing around with my little 500M row table, I notice that the table
itself is only 25 GB, and even with 4 indices, the whole shebang is
less than 60 GB. What's killing me is disk access speed. Currently, my
"standard" query is 70 sec., entirely due to limitations on the number
of random I/Os I can do per second.

I can fairly cheaply halve this problem by striping the database across
two disks, but then I double the space available. If that leads me to
double the database size, I'm back in the same hole I was in before,
maybe worse.

It's way, way too easy these days to run up a terrabyte of RAID-5
storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
($250 each) sets you out about $2500. But the problem is, all this
storage often doesn't have the I/O bandwidth you need actually to make
use of it....

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: "Steve Wolfe" <steve(at)iboats(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Row Limit on tables
Date: 2002-05-31 18:57:16
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

> > In practice, "what will fit on your disk" is the limit.
>
> Actually, not even. Or so I think for most cases.

(snip)

> I can fairly cheaply halve this problem by striping the database across
> two disks, but then I double the space available. If that leads me to
> double the database size, I'm back in the same hole I was in before,
> maybe worse.

You seem to be confusing the question "How many rows can I have in a
table?" with "How fast will query (X) run on a table with (Y) rows?"

The question and documents we're talking about are dealing with hard
ceilings, not with performance. In fact, trying to give performance
estimates for such situations is kind of silly - queries on such tables
could be anywhere from very, very fast to very, very slow. Many, many
factors are involved.

> It's way, way too easy these days to run up a terrabyte of RAID-5
> storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
> ($250 each) sets you out about $2500. But the problem is, all this
> storage often doesn't have the I/O bandwidth you need actually to make
> use of it....

That's one extreme end of the spectrum, why not look at the other? You
could load up on low-latency, 10K RPM 9-gig drives, and have amazing
throughput with only a very small fraction of the total storage capacity.
It's all in what you're looking for.

steve


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Steve Wolfe <steve(at)iboats(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-06-01 06:09:16
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

On Fri, 31 May 2002, Steve Wolfe wrote:

> You seem to be confusing the question "How many rows can I have in a
> table?" with "How fast will query (X) run on a table with (Y) rows?"

IMHO, they're part of the same question. Given unlimited time, sure
your table size limit is disk space. But it seems to me, from
looking at the conversations here, that people don't care to wait
unlimited time for the results of a query.

> That's one extreme end of the spectrum, why not look at the other? You
> could load up on low-latency, 10K RPM 9-gig drives, and have amazing
> throughput with only a very small fraction of the total storage capacity.

I'm not convinced that the thoughput would be all that much better,
in fact, given an equivalant number of drives. I'd be interested
in hearing about the results of tests that show otherwise.

How many I/Os per second does a good modern 10K RPM SCSI drive do
these days, anyway?

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: Curt Sampson <cjs(at)cynic(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: shey sewani <pakix2000(at)hotmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Row Limit on tables
Date: 2002-06-01 08:19:54
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

At 03:25 AM 6/1/02 +0900, Curt Sampson wrote:
>It's way, way too easy these days to run up a terrabyte of RAID-5
>storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
>($250 each) sets you out about $2500. But the problem is, all this
>storage often doesn't have the I/O bandwidth you need actually to make
>use of it....

Use smaller HDDs, or if cost is not a big issue just use the faster
portions of the big HDDs.

The smaller HDDs are about the same speed as the big ones of the same model
range- the bigger ones just have more platters.

But the bigger ones allow you to throw away slower space. By using just the
faster portions (the outer edges) you can reduce seek time by a few
milliseconds, and keep throughput in the higher bands.

Cheerio,
Link.


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-06-01 10:02:31
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

On Sat, 1 Jun 2002, Lincoln Yeoh wrote:

> The smaller HDDs are about the same speed as the big ones of the same model
> range- the bigger ones just have more platters.

Used to be the case. It's not so often so any more.

> But the bigger ones allow you to throw away slower space. By using just the
> faster portions (the outer edges) you can reduce seek time by a few
> milliseconds, and keep throughput in the higher bands.

Right. That still works!

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
To: Devrim GUNDUZ <devrim(at)oper(dot)metu(dot)edu(dot)tr>
Cc: shey sewani <pakix2000(at)hotmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-06-02 00:04:39
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general


>I cannot remember the definition exactly; but as far as I remember, it was
>with count() function; which returns 32-bit integer, and for tables having
>more than 2 billion rows, count() has really no "meaning".

In 7.2.1 the aggregate functions now return longs (64-bit values).

Cheers,

Doug


From: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
To: Curt Sampson <cjs(at)cynic(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, shey sewani <pakix2000(at)hotmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Row Limit on tables
Date: 2002-06-02 00:07:52
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general


>It's way, way too easy these days to run up a terrabyte of RAID-5
>storage. One Escalade 7850 controller ($500) plus eight 160 GB drives
>($250 each) sets you out about $2500. But the problem is, all this
>storage often doesn't have the I/O bandwidth you need actually to make
>use of it....

FYI... The Escalade controller configured as you suggest here is known to
do 125-145mb/sec at least... Much faster than any single IDE. (I use
Escalades until very recently when I had to switch to SCSI for just that
last bit more of performance.)

Cheers,

Doug


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Limit on tables
Date: 2002-06-02 07:21:22
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-general

On Sat, 1 Jun 2002, Doug Fields wrote:

> FYI... The Escalade controller configured as you suggest here is known to
> do 125-145mb/sec at least...

When doing database stuff, you're most likely to run into the number
of I/Os per second limit long before the bandwidth limit.

On my queries, an IDE drive perfectly cable of transferring 30
MB/sec. ends up transferring more like two or three, because it
can't do more than about 140 reads per second, and the reads are
only 16K.

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC