Add TODO detail directory.

957e6a69 · Bruce Momjian · 75596775 · 957e6a69 · 957e6a69 · 957e6a69
19 changed file
--- a/doc/TODO.detail/README
+++ b/doc/TODO.detail/README
+These files are in standard Unix mailbox format, and are detail
+information related to the TODO list.
--- a/doc/TODO.detail/alpha
+++ b/doc/TODO.detail/alpha
+From owner-pgsql-hackers@hub.org Fri May 14 16:00:46 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA02173
+	for <maillist@candle.pha.pa.us>; Fri, 14 May 1999 16:00:44 -0400 (EDT)
+Received: from hub.org (hub.org [209.167.229.1]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id QAA02824 for <maillist@candle.pha.pa.us>; Fri, 14 May 1999 16:00:45 -0400 (EDT)
+Received: from hub.org (hub.org [209.167.229.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id PAA47798;
+	Fri, 14 May 1999 15:57:54 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 14 May 1999 15:54:30 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id PAA47191
+	for pgsql-hackers-outgoing; Fri, 14 May 1999 15:54:28 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from thelab.hub.org (nat194.147.mpoweredpc.net [142.177.194.147])
+	by hub.org (8.9.3/8.9.3) with ESMTP id PAA46457
+	for <pgsql-hackers@postgresql.org>; Fri, 14 May 1999 15:49:35 -0400 (EDT)
+	(envelope-from scrappy@hub.org)
+Received: from localhost (scrappy@localhost)
+	by thelab.hub.org (8.9.3/8.9.1) with ESMTP id QAA16128;
+	Fri, 14 May 1999 16:49:44 -0300 (ADT)
+	(envelope-from scrappy@hub.org)
+X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
+Date: Fri, 14 May 1999 16:49:44 -0300 (ADT)
+From: The Hermit Hacker <scrappy@hub.org>
+To: pgsql-hackers@postgreSQL.org
+cc: Jack Howarth <howarth@nitro.med.uc.edu>
+Subject: [HACKERS] postgresql bug report (fwd)
+Message-ID: <Pine.BSF.4.05.9905141649150.47191-100000@thelab.hub.org>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+
+Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
+Systems Administrator @ hub.org 
+primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
+
+---------- Forwarded message ----------
+Date: Fri, 14 May 1999 14:50:58 -0400
+From: Jack Howarth <howarth@nitro.med.uc.edu>
+To: scrappy@hub.org
+Subject: postgresql bug report
+
+Marc,
+      In porting the RedHat 6.0 srpm set for a linuxppc release we
+believe a bug has been identified in
+the postgresql source for 6.5-0.beta1. Our development tools are as
+follows...
+
+glibc 2.1.1 pre 2
+linux 2.2.6
+egcs 1.1.2
+the latest binutils snapshot
+
+The bug that we see is that when egcs compiles postgresql at -O1 or
+higher (-O0 is fine),
+postgresql creates incorrectly formed databases such that when the user
+does a destroydb
+the database can not be destroyed. Franz Sirl has identified the problem
+as follows...
+
+    it seems that this problem is a type casting/promotion bug in the
+source. The
+    routine _bt_checkkeys() in backend/access/nbtree/nbtutils.c calls
+int2eq() in
+    backend/utils/adt/int.c via a function pointer
+*fmgr_faddr(&key[0].sk_func). As
+    the type information for int2eq is lost via the function pointer,
+the compiler
+    passes 2 ints, but int2eq expects 2 (preformatted in a 32bit reg)
+int16's.
+    This particular bug goes away, if I for example change int2eq to:
+
+    bool
+    int2eq(int32 arg1, int32 arg2)
+    {
+            return (int16)arg1 == (int16)arg2;
+    }
+
+    This moves away the type casting/promotion "work" from caller to the
+callee and
+    is probably the right thing to do for functions used via function
+pointers.
+
+...because of the large number of changes required to do this, Franz
+thought we should
+pass this on to the postgresql maintainers for correction. Please feel
+free to contact
+Franz Sirl (Franz.Sirl-kernel@lauterbach.com) if you have any questions
+on this bug
+report.
+
+--
+------------------------------------------------------------------------------
+Jack W. Howarth, Ph.D.                                     231 Bethesda Avenue
+NMR Facility Director                              Cincinnati, Ohio 45267-0524
+Dept. of Molecular Genetics                              phone: (513) 558-4420
+Univ. of Cincinnati College of Medicine                    fax: (513) 558-8474
+
+
+
+
+
+
--- a/doc/TODO.detail/arrays
+++ b/doc/TODO.detail/arrays
+From owner-pgsql-hackers@hub.org Wed Nov 25 19:01:02 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA16399
+	for <maillist@candle.pha.pa.us>; Wed, 25 Nov 1998 19:01:01 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id SAA05250 for <maillist@candle.pha.pa.us>; Wed, 25 Nov 1998 18:53:12 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id SAA17798;
+	Wed, 25 Nov 1998 18:49:38 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 25 Nov 1998 18:49:07 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id SAA17697
+	for pgsql-hackers-outgoing; Wed, 25 Nov 1998 18:49:06 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from mail.enterprise.net (root@mail.enterprise.net [194.72.192.18])
+	by hub.org (8.9.1/8.9.1) with ESMTP id SAA17650;
+	Wed, 25 Nov 1998 18:48:55 -0500 (EST)
+	(envelope-from olly@lfix.co.uk)
+Received: from linda.lfix.co.uk (root@max01-040.enterprise.net [194.72.197.40])
+	by mail.enterprise.net (8.8.5/8.8.5) with ESMTP id XAA20539;
+	Wed, 25 Nov 1998 23:48:52 GMT
+Received: from linda.lfix.co.uk (olly@localhost [127.0.0.1])
+	by linda.lfix.co.uk (8.9.1a/8.9.1/Debian/GNU) with ESMTP id XAA12089;
+	Wed, 25 Nov 1998 23:48:52 GMT
+Message-Id: <199811252348.XAA12089@linda.lfix.co.uk>
+X-Mailer: exmh version 2.0.2 2/24/98 (debian) 
+X-URL: http://www.lfix.co.uk/oliver
+X-face: "xUFVDj+ZJtL_IbURmI}!~xAyPC"Mrk=MkAm&tPQnNq(FWxv49R}\>0oI8VM?O2VY+N7@F-
+	KMLl*!h}B)u@TW|B}6<X<J|}QsVlTi:RA:O7Abc(@D2Y/"J\S,b1!<&<B/J}b.Ii9@B]H6V!+#sE0Q
+	_+=`K$5TI|4I0-=Cp%pt~L#QYydO'iBXR~\tT?uftep9n9AF`@SzTwsw6uqJ}pL,h(cZi}T#PB"#!k
+	p^e=Z.K~fuw$l?]lUV)?R]U}l;f*~Ol)#fpKR)Yt}XOr6BI\_Jjr0!@GMnpCTnTym4f;c{;Ms=0{`D
+	Lq9MO6{wj%s-*N"G,g
+To: bugs@postgreSQL.org, hackers@postgreSQL.org
+Subject: [HACKERS] Failures with arrays
+Mime-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Date: Wed, 25 Nov 1998 23:48:51 +0000
+From: "Oliver Elphick" <olly@lfix.co.uk>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+This was reported as a bug with the Debian package of 6.3.2; the same
+behaviour is still present in 6.4. 
+
+bray=> create table foo ( t text[]);
+CREATE
+bray=> insert into foo values ( '{"a"}');
+INSERT 201354 1
+bray=> insert into foo values ( '{"a","b"}');
+INSERT 201355 1
+bray=>  insert into foo values ( '{"a","b","c"}');
+INSERT 201356 1
+bray=>  select * from foo;
+t            
+-------------
+{"a"}        
+{"a","b"}    
+{"a","b","c"}
+(3 rows)
+
+bray=> select t[1] from foo;
+ERROR:  type name lookup of t failed
+bray=> select * from foo;
+t            
+-------------
+{"a"}        
+{"a","b"}    
+{"a","b","c"}
+(3 rows)
+
+bray=> select foo.t[1] from foo;
+t
+-
+a
+a
+a
+(3 rows)
+
+bray=> select count(foo.t[1]) from foo;
+pqReadData() -- backend closed the channel unexpectedly.
+
+-- 
+Oliver Elphick                                Oliver.Elphick@lfix.co.uk
+Isle of Wight                              http://www.lfix.co.uk/oliver
+               PGP key from public servers; key ID 32B8FAA1
+                 ========================================
+     "Let us therefore come boldly unto the throne of grace,
+      that we may obtain mercy, and find grace to help in 
+      time of need."             Hebrews 4:16 
+
+
+
+
--- a/doc/TODO.detail/cnfify
+++ b/doc/TODO.detail/cnfify
+From daybee@bellatlantic.net Sun Aug 23 20:21:48 1998
+Received: from iconmail.bellatlantic.net (iconmail.bellatlantic.net [199.173.162.30])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id UAA26688
+	for <maillist@candle.pha.pa.us>; Sun, 23 Aug 1998 20:21:46 -0400 (EDT)
+Received: from bellatlantic.net (client196-126-169.bellatlantic.net [151.196.126.169])
+	by iconmail.bellatlantic.net (IConNet Sendmail) with ESMTP id UAA09478;
+	Sun, 23 Aug 1998 20:18:35 -0400 (EDT)
+Message-ID: <35E0ABF0.578694C8@bellatlantic.net>
+Date: Sun, 23 Aug 1998 19:55:29 -0400
+From: David Hartwig <daybee@bellatlantic.net>
+Organization: Home
+X-Mailer: Mozilla 4.04 [en] (Win95; I)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: hannu@trust.ee, pgsql-interfaces@postgreSQL.org, hackers@postgreSQL.org
+Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
+References: <199808220353.XAA04528@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+
+
+Bruce Momjian wrote:
+
+> >
+> > Hannu Krosing wrote:
+> >
+> > > > The days where every release fixed server crashes, or added a feature
+> > > > that users were 'screaming for' may be a thing of the past.
+> > >
+> > > Is anyone working on fixing the exploding optimisations for many OR-s,
+> > > at least the canonic case used by access?
+> > >
+> > > My impression is that this has fallen somewhere between
+> > > insightdist and Vadim.
+> >
+> > This is really big for the ODBCers. (And I suspect for JDBCers too.)  Many
+> > desktop libraries and end-user tools depend on this "record set" strategy to
+> > operate effectively.
+> >
+> > I have put together a workable hack that runs just before cnfify().  The
+> > option is activated through the SET command.  Once activated, it identifies
+> > queries with this particular multi-OR pattern generated by these RECORD SET
+> > strategies.  Qualified query trees are rewritten as multiple UNIONs.   (One
+> > for each OR grouping).
+> >
+> > The results are profound.    Queries that used to scan tables because of the
+> > ORs, now make use of any indexes.   Thus, the size of the table has virtually
+> > no effect on performance.  Furthermore, queries that used to crash the
+> > backend, now run in under a second.
+> >
+> > Currently the down sides are:
+> >     1. If there is no usable index, performance is significantly worse.  The
+> > patch does not check to make sure that there is a usable index.  I could use
+> > some pointers on this.
+> >
+> >     2. Small tables are actually a bit slower than without the patch.
+> >
+> >     3.  Not very elegant.    I am looking for a more generalized solution.
+> > I have lots of ideas, but I would need to know the backend much better before
+> > attempting any of them.   My favorite idea is before cnfify(), to factor the
+> > OR terms and pull out the constants into a virtual (temporary) table spaces.
+> > Then rewrite the query as a join.   The optimizer will (should) treat the new
+> > query accordingly.  This assumes that an efficient factoring algorithm exists
+> > and that temporary tables can exist in the heap.
+> >
+> > Illustration:
+> > SELECT ... FROM tab WHERE
+> > (var1 = const1 AND var2 = const2) OR
+> > (var1 = const3 AND var2 = const4) OR
+> > (var1 = const5 AND var2 = const6)
+> >
+> > SELECT ... FROM tab, tmp WHERE
+> > (var1 = var_x AND var2 = var_y)
+> >
+> > tmp
+> > var_x  | var_y
+> > --------------
+> > const1|const2
+> > const3|const4
+> > const5|const6
+>
+> David, where are we on this?  I know we have OR's using indexes.  Do we
+> still need to look this as a fix, or are we OK.   I have not gotten far
+> enough in the optimizer to know how to fix the
+
+Bruce,
+
+If the question is, have I come up with a solution for the cnf'ify problem:  No
+
+If the question is, is it still important:  Very much yes.
+
+It is essential for many RAD tools using remote data objects which make use of key
+sets.  Your recent optimization of the OR list goes a long way, but inevitably
+users are confronted with multi-part keys.
+
+When I look at the problem my head spins.   I do not have the experience (yet?)
+with the backend to be mucking around in the optimizer.  As I see it, cnf'ify is
+doing just what it is supposed to do.  Boundless boolean logic.
+
+I think hope may lay though, in identifying each AND'ed group associated with a key
+and tagging it as a special sub-root node which cnf'ify does not penetrate.   This
+node would be allowed to pass to the later stages of the optimizer where it will be
+used to plan index scans.  Easy for me to say.
+
+In the meantime, I still have the patch that I described in prior email.  It has
+worked well for us.  Let me restate that.   We could not survive without it!
+However, I do not feel that is a sufficiently functional approach that should be
+incorporated as a final solution.     I will submit the patch if you, (anyone) does
+not come up with a better solution.  It is coded to be activated by a SET KSQO to
+minimize its reach.
+
+
+From daybee@bellatlantic.net Sun Aug 30 12:06:24 1998
+Received: from iconmail.bellatlantic.net (iconmail.bellatlantic.net [199.173.162.30])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA12860
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 12:06:22 -0400 (EDT)
+Received: from bellatlantic.net (client196-126-73.bellatlantic.net [151.196.126.73])
+	by iconmail.bellatlantic.net (IConNet Sendmail) with ESMTP id MAA18468;
+	Sun, 30 Aug 1998 12:03:33 -0400 (EDT)
+Message-ID: <35E9726E.C6E73049@bellatlantic.net>
+Date: Sun, 30 Aug 1998 11:40:31 -0400
+From: David Hartwig <daybee@bellatlantic.net>
+Organization: Home
+X-Mailer: Mozilla 4.06 [en] (Win98; I)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: hannu@trust.ee, pgsql-interfaces@postgreSQL.org, hackers@postgreSQL.org
+Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
+References: <199808290344.XAA28089@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+
+
+Bruce Momjian wrote:
+
+> OK, let me try this one.
+>
+> Why is the system cnf'ifying the query.  Because it  wants to have a
+> list of qualifications that are AND'ed, so it can just pick the most
+> restrictive/cheapest, and evaluate that one first.
+>
+> If you have:
+>
+>         (a=b and c=d) or e=1
+>
+> In this case, without cnf'ify, it has to evaluate both of them, because
+> if one is false, you can't be sure another would be true.  In the
+> cnf'ify case,
+>
+>         (a=b or e=1) and (c=d or e=1)
+>
+> In this case, it can choose either, and act on just one, if a row fails
+> to meet it, it can stop and not evaluate it using the other restriction.
+>
+> The fact is that it is only going to use fancy join/index in one of the
+> two cases, so it tries to pick the best one, and does a brute-force
+> qualification test on the remaining item if the first one tried is true.
+>
+> The problem is of course large where clauses can exponentially expand
+> this.  What it really trying to do is to pick a cheapest restriction,
+> but the memory explosion and query failure are serious problems.
+>
+> The issue is that it thinks it is doing something to help things, while
+> it is actually hurting things.
+>
+> In the ODBC case of:
+>
+>         (x=3 and y=4) or
+>         (x=3 and y=5) or
+>         (x=3 and y=6) or ...
+>
+> it clearly is not going to gain anything by choosing any CHEAPEST path,
+> because they are all the same in terms of cost, and the use by ODBC
+> clients is hurting reliability.
+>
+> I am inclined to agree with David's solution of breaking apart the query
+> into separate UNION queries in certain cases.  It seems to be the most
+> logical solution, because the cnf'ify code is working counter to its
+> purpose in these cases.
+>
+> Now, the question is how/where to implement this.  I see your idea of
+> making the OR a join to a temp table that holds all the constants.
+> Another idea would be to do actual UNION queries:
+>
+>         SELECT * FROM tab
+>         WHERE (x=3 and y=4)
+>         UNION
+>         SELECT * FROM tab
+>         WHERE (x=3 and y=5)
+>         UNION
+>         SELECT * FROM tab
+>         WHERE (x=3 and y=6) ...
+>
+> This would work well for tables with indexes, but for a sequential scan,
+> you are doing a sequential scan for each UNION.
+
+Practically speaking, the lack of an index concern, may not be justified.   The reason
+these queries are being generated, with this shape, is because remote data objects on the
+client side are being told that a primary key exists on these tables.  The object is told
+about these keys  in one of two ways.
+
+1.  It queries the database for the primary key of the table.  The ODBC driver serviced
+this request by querying for the attributes used in {table_name}_pkey.
+
+2.  The user manually specifies the primary key.  In this case an actual index may not
+exist.   (i.e. MS Access asks the user for this information if a primary key is not found
+in a table)
+
+The second case is the only one that would cause a problem.  Fortunately, the solution is
+simple.  Add a primary key index!
+
+My only concern is to be able to accurately identify a query with the proper signature
+before rewriting it as a UNION.   To what degree should this inspection be taken?
+
+BTW,  I would not do the rewrite on OR's without AND's since you have fixed the OR's use
+of the index.
+
+There is one other potential issue.  My experience with using arrays in tables and UNIONS
+creates problems.  There are missing array comparison operators which are used by the
+implied DISTINCT.
+
+> Another idea is
+> subselects.  Also, you have to make sure you return the proper rows,
+> keeping duplicates where they are in the base table, but not returning
+> them when the meet more than one qualification.
+>
+>         SELECT * FROM tab
+>         WHERE (x,y) IN (SELECT 3, 4
+>                         UNION
+>                         SELECT 3, 5
+>                         UNION
+>                         SELECT 3, 6)
+>
+> I believe we actually support this.  This is not going to use an index
+> on tab, so it may be slow if x and y are indexed.
+>
+> Another more bizarre solution is:
+>
+>         SELECT * FROM tab
+>         WHERE (x,y) = (SELECT 3, 4) OR
+>               (x,y) = (SELECT 3, 5) OR
+>               (x,y) = (SELECT 3, 6)
+>
+> Again, I think we do this too.  I don't think cnf'ify does anything with
+> this.  I also believe "=" uses indexes on subselects, while IN does not
+> because IN could return lots of rows, and an index is slower than a
+> non-index join on lots of rows.  Of course, now that we index OR's.
+>
+> Let me ask another question.  If I do:
+>
+>         SELECT * FROM tab WHERE x=3 OR x=4
+>
+> it works, and uses indexes.  Why can't the optimizer just not cnf'ify
+> things sometimes, and just do:
+>
+>         SELECT * FROM tab
+>         WHERE   (x=3 AND y=4) OR
+>                 (x=3 AND y=5) OR
+>                 (x=3 AND y=6)
+>
+> Why can it handle x=3 OR x=4, but not the more complicated case above,
+> without trying to be too smart?  If x,y is a multi-key index, it could
+> use that quite easily.  If not, it can do a sequentail scan and run the
+> tests.
+>
+> Another issue.  To the optimizer, x=3 and x=y are totally different.  In
+> x=3, it is a column compared to a constant, while in x=y, it is a join.
+> That makes a huge difference.
+>
+> In the case of (a=b and c=d) or e=1, you pick the best path and do the
+> a=b join, and throw in the e=1 entries.  You can't easily do both joins,
+> because you also need the e=1 stuff.
+>
+> I wounder what would happen if we prevent cnf'ifying of cases where the
+> OR represent only column = constant restrictions.
+>
+> I meant to really go through the optimizer this month, but other backend
+> items took my time.
+>
+> Can someone run some tests on disabling the cnf'ify calls.  It is my
+> understanding that with the non-cnf-ify'ed query, it can't choose an
+> optimial path, and starts to do either straight index matches,
+> sequential scans, or cartesian products where it joins every row to
+> every other row looking for a match.
+>
+> Let's say we turn off cnf-ify just for non-join queries.  Does that
+> help?
+>
+> I am not sure of the ramifications of telling the optimizer it no longer
+> has a variety of paths to choose for evaluating the query.
+
+I did not try this earlier because I thought it was too good to be true.   I was right.
+I tried commenting out the normalize() function in the cnfify().   The EXPLAIN showed a
+sequential scan and the resulting tuple set was empty.   Time will not allow me to dig
+into this further this weekend.
+
+Unless you come up with a better solution, I am going to submit my patch on Monday to
+make the Sept. 1st deadline.  It includes a SET switch to activate the rewrite so as not
+to cause problems outside the ODBC users.    We can either improve, it or yank it, by the
+Oct. 1st deadline.
+
+
+From infotecn@tin.it Mon Aug 31 03:01:51 1998
+Received: from mail.tol.it (mail.tin.it [194.243.154.49])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id DAA09740
+	for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 03:01:48 -0400 (EDT)
+Received: from Server.InfoTecna.com (a-mz6-50.tin.it [212.216.9.113])
+          by mail.tol.it (8.8.4/8.8.4) with ESMTP
+   id JAA16451; Mon, 31 Aug 1998 09:00:35 +0200 (MET DST)
+Received: from tm3.InfoTecna.com (Tm1.InfoTecna.com [192.168.1.1])
+	by Server.InfoTecna.com (8.8.5/8.8.5) with SMTP id IAA18678;
+	Mon, 31 Aug 1998 08:53:13 +0200
+Message-Id: <3.0.5.32.19980831085312.00986cc0@MBox.InfoTecna.com>
+X-Sender: denis@MBox.InfoTecna.com
+X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.5 (32)
+Date: Mon, 31 Aug 1998 08:53:12 +0200
+To: David Hartwig <daybee@bellatlantic.net>,
+        Bruce Momjian <maillist@candle.pha.pa.us>
+From: Sbragion Denis <infotecn@tin.it>
+Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
+Cc: hannu@trust.ee, pgsql-interfaces@postgreSQL.org, hackers@postgreSQL.org
+In-Reply-To: <35E9726E.C6E73049@bellatlantic.net>
+References: <199808290344.XAA28089@candle.pha.pa.us>
+Mime-Version: 1.0
+Content-Type: text/plain; charset="us-ascii"
+Status: RO
+
+Hello,
+
+At 11.40 30/08/98 -0400, David Hartwig wrote:
+>> Why is the system cnf'ifying the query.  Because it  wants to have a
+>> list of qualifications that are AND'ed, so it can just pick the most
+>> restrictive/cheapest, and evaluate that one first.
+
+Just a small question about all this optimizations stuff. I'm not a
+database expert but I think we are talking about a NP-complete problem.
+Could'nt we convert this optimization problem into another NP one that is
+known to have a good solution ? For example for the traveling salesman
+problem there's an alghoritm that provide a solution that's never more than
+two times the optimal one an provides results that are *really* near the
+optimal one most of the times. The simplex alghoritm may be another
+example. I think that this kind of alghoritm would be better than a
+collection ot tricks for special cases, and this tricks could be used
+anyway when special cases are detected. Furthermore I also know that exists
+a free program I used in the past that provides this kind of optimizations
+for chip design. I don't remember the exact name of the program but I
+remember it came from Berkeley university. Of course may be I'm totally
+missing the point.
+
+Hope it helps !
+
+Bye!
+
+	Dr. Sbragion Denis
+	InfoTecna
+	Tel, Fax: +39 39 2324054
+	URL: http://space.tin.it/internet/dsbragio
+
+From andreas.zeugswetter@telecom.at Mon Aug 31 06:31:13 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id GAA14231
+	for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 06:31:12 -0400 (EDT)
+Received: from gandalf.telecom.at (gandalf.telecom.at [194.118.26.84]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id GAA21099 for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 06:23:41 -0400 (EDT)
+Received: from zeugswettera.user.lan.at (zeugswettera.user.lan.at [10.4.123.227]) by gandalf.telecom.at (A.B.C.Delta4/8.8.8) with SMTP id MAA38132; Mon, 31 Aug 1998 12:22:07 +0200
+Received: by zeugswettera.user.lan.at with Microsoft Mail
+	id <01BDD4DA.C7F5B690@zeugswettera.user.lan.at>; Mon, 31 Aug 1998 12:27:55 +0200
+Message-ID: <01BDD4DA.C7F5B690@zeugswettera.user.lan.at>
+From: Andreas Zeugswetter <andreas.zeugswetter@telecom.at>
+To: "'maillist@candle.pha.pa.us'" <maillist@candle.pha.pa.us>
+Cc: "hackers@postgreSQL.org" <hackers@postgreSQL.org>
+Subject: AW: [INTERFACES] Re: [HACKERS] changes in 6.4
+Date: Mon, 31 Aug 1998 12:22:05 +0200
+Encoding: 31 TEXT
+Status: RO
+
+
+>Another idea would be to do actual UNION queries:
+>
+>	SELECT * FROM tab
+>	WHERE (x=3 and y=4)
+>	UNION
+>	SELECT * FROM tab
+>	WHERE (x=3 and y=5)
+>	UNION
+>	SELECT * FROM tab
+>	WHERE (x=3 and y=6) ...
+>
+>This would work well for tables with indexes, but for a sequential scan,
+>you are doing a sequential scan for each UNION.
+
+The most important Application for this syntax will be M$ Access
+because it uses this syntax to display x rows from a table in a particular
+sort order. In this case x and y will be the primary key and therefore have a
+unique index. So I think this special case should work good.
+
+The strategy could be something like:
+iff x, y is a unique index
+	do the union access path
+else
+	do something else
+done
+
+I think hand written SQL can always be rewritten if it is not fast enough
+using this syntax.
+
+Andreas
+
+
+From owner-pgsql-patches@hub.org Tue Sep  1 02:01:10 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA28687
+	for <maillist@candle.pha.pa.us>; Tue, 1 Sep 1998 02:01:06 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA02180; Tue, 1 Sep 1998 01:48:43 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 01 Sep 1998 01:47:48 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA02160 for pgsql-patches-outgoing; Tue, 1 Sep 1998 01:47:46 -0400 (EDT)
+Received: from iconmail.bellatlantic.net (iconmail.bellatlantic.net [199.173.162.30]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA02147 for <pgsql-patches@postgreSQL.org>; Tue, 1 Sep 1998 01:47:42 -0400 (EDT)
+Received: from bellatlantic.net (client196-126-3.bellatlantic.net [151.196.126.3])
+	by iconmail.bellatlantic.net (IConNet Sendmail) with ESMTP id XAA27530
+	for <pgsql-patches@postgreSQL.org>; Mon, 31 Aug 1998 23:24:07 -0400 (EDT)
+Message-ID: <35EB2B33.EBF1E9AA@bellatlantic.net>
+Date: Mon, 31 Aug 1998 19:01:07 -0400
+From: David Hartwig <daybee@bellatlantic.net>
+Organization: Insight Distribution Systems
+X-Mailer: Mozilla 4.04 [en] (X11; I; Linux 2.0.29 i586)
+MIME-Version: 1.0
+To: patches <pgsql-patches@postgreSQL.org>
+Subject: [PATCHES] Interim AND/OR memory exaustion fix.
+Content-Type: multipart/mixed; boundary="------------BEFD1E6DA78A2DC20B524E32"
+Sender: owner-pgsql-patches@hub.org
+Precedence: bulk
+Status: ROr
+
+This is a multi-part message in MIME format.
+--------------BEFD1E6DA78A2DC20B524E32
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+
+I will be cleaning this up more before the Oct 1 deadline.
+
+--------------BEFD1E6DA78A2DC20B524E32
+Content-Type: text/plain; charset=us-ascii; name="keyset.patch"
+Content-Transfer-Encoding: 7bit
+Content-Disposition: inline; filename="keyset.patch"
+
+*** ./backend/commands/variable.c.orig	Thu Jul 30 19:25:26 1998
+--- ./backend/commands/variable.c	Mon Aug 31 17:23:32 1998
+***************
+*** 24,29 ****
+--- 24,30 ----
+  extern bool _use_geqo_;
+  extern int32 _use_geqo_rels_;
+  extern bool _use_right_sided_plans_;
+ extern bool _use_keyset_query_optimizer;
+  
+  /*-----------------------------------------------------------------------*/
+  static const char *
+***************
+*** 559,564 ****
+--- 560,568 ----
+  	},
+  #endif
+  	{
+ 		"ksqo", parse_ksqo, show_ksqo, reset_ksqo
+ 	},
+ 	{
+  		NULL, NULL, NULL, NULL
+  	}
+  };
+***************
+*** 611,615 ****
+--- 615,663 ----
+  
+  	elog(NOTICE, "Unrecognized variable %s", name);
+  
+ 	return TRUE;
+ }
+ 
+ 
+ /*-----------------------------------------------------------------------
+ KSQO code will one day be unnecessary when the optimizer makes use of 
+ indexes when multiple ORs are specified in the where clause.
+ See optimizer/prep/prepkeyset.c for more on this.
+ 	daveh@insightdist.com    6/16/98
+ -----------------------------------------------------------------------*/
+ bool
+ parse_ksqo(const char *value)
+ {
+ 	if (value == NULL)
+ 	{
+ 		reset_ksqo();
+ 		return TRUE;
+ 	}
+ 
+ 	if (strcasecmp(value, "on") == 0)
+ 		_use_keyset_query_optimizer = true;
+ 	else if (strcasecmp(value, "off") == 0)
+ 		_use_keyset_query_optimizer = false;
+ 	else
+ 		elog(ERROR, "Bad value for Key Set Query Optimizer (%s)", value);
+ 
+ 	return TRUE;
+ }
+ 
+ bool
+ show_ksqo()
+ {
+ 
+ 	if (_use_keyset_query_optimizer)
+ 		elog(NOTICE, "Key Set Query Optimizer is ON");
+ 	else
+ 		elog(NOTICE, "Key Set Query Optimizer is OFF");
+ 	return TRUE;
+ }
+ 
+ bool
+ reset_ksqo()
+ {
+ 	_use_keyset_query_optimizer = false;
+  	return TRUE;
+  }
+*** ./backend/optimizer/plan/planner.c.orig	Sun Aug 30 04:28:02 1998
+--- ./backend/optimizer/plan/planner.c	Mon Aug 31 17:23:32 1998
+***************
+*** 69,74 ****
+--- 69,75 ----
+  	PlannerInitPlan = NULL;
+  	PlannerPlanId = 0;
+  
+ 	transformKeySetQuery(parse);
+  	result_plan = union_planner(parse);
+  
+  	Assert(PlannerQueryLevel == 1);
+*** ./backend/optimizer/prep/Makefile.orig	Sun Apr  5 20:23:48 1998
+--- ./backend/optimizer/prep/Makefile	Mon Aug 31 17:23:32 1998
+***************
+*** 13,19 ****
+  
+  CFLAGS += -I../..
+  
+! OBJS = prepqual.o preptlist.o prepunion.o
+  
+  # not ready yet: predmig.o xfunc.o
+  
+--- 13,19 ----
+  
+  CFLAGS += -I../..
+  
+! OBJS = prepqual.o preptlist.o prepunion.o prepkeyset.o
+  
+  # not ready yet: predmig.o xfunc.o
+  
+*** ./backend/optimizer/prep/prepkeyset.c.orig	Mon Aug 31 17:23:32 1998
+--- ./backend/optimizer/prep/prepkeyset.c	Mon Aug 31 18:30:58 1998
+***************
+*** 0 ****
+--- 1,213 ----
+ /*-------------------------------------------------------------------------
+  *
+  * prepkeyset.c--
+  *	  Special preperation for keyset queries.
+  *
+  * Copyright (c) 1994, Regents of the University of California
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include <stdio.h>
+ #include <string.h>
+ 
+ #include "postgres.h"
+ #include "nodes/pg_list.h"
+ #include "nodes/parsenodes.h"
+ #include "utils/elog.h"
+ 
+ #include "nodes/nodes.h"
+ #include "nodes/execnodes.h"
+ #include "nodes/plannodes.h"
+ #include "nodes/primnodes.h"
+ #include "nodes/relation.h"
+ 
+ #include "catalog/pg_type.h"
+ #include "lib/stringinfo.h"
+ #include "optimizer/planmain.h"
+ /*
+  * Node_Copy--
+  *        a macro to simplify calling of copyObject on the specified field
+  */
+ #define Node_Copy(from, newnode, field) newnode->field = copyObject(from->field)
+ 
+ /*****  DEBUG stuff
+ #define TABS {int i; printf("\n"); for (i = 0; i<level; i++) printf("\t"); }
+ static int level = 0;
+ ******/
+ 
+ bool _use_keyset_query_optimizer = FALSE;
+ 
+ static int inspectOpNode(Expr *expr);
+ static int inspectAndNode(Expr *expr);
+ static int inspectOrNode(Expr *expr);
+ 
+ /**********************************************************************
+  *   This routine transforms query trees with the following form:
+  *       SELECT a,b, ... FROM one_table WHERE
+  *        (v1 = const1 AND v2 = const2 [ vn = constn ]) OR
+  *        (v1 = const3 AND v2 = const4 [ vn = constn ]) OR
+  *        (v1 = const5 AND v2 = const6 [ vn = constn ]) OR
+  *                         ...
+  *        [(v1 = constn AND v2 = constn [ vn = constn ])]
+  *
+  *                             into 
+  *
+  *       SELECT a,b, ... FROM one_table WHERE
+  *        (v1 = const1 AND v2 = const2 [ vn = constn ]) UNION
+  *       SELECT a,b, ... FROM one_table WHERE
+  *        (v1 = const3 AND v2 = const4 [ vn = constn ]) UNION
+  *       SELECT a,b, ... FROM one_table WHERE
+  *        (v1 = const5 AND v2 = const6 [ vn = constn ]) UNION
+  *                         ...
+  *       SELECT a,b, ... FROM one_table WHERE
+  *        [(v1 = constn AND v2 = constn [ vn = constn ])]
+  *
+  *
+  *   To qualify for transformation the query must not be a sub select,
+  *   a HAVING, or a GROUP BY.   It must be a single table and have KSQO
+  *   set to 'on'.  
+  *
+  *   The primary use of this transformation is to avoid the exponrntial 
+  *   memory consumption of cnfify() and to make use of index access
+  *   methods.
+  *
+  *        daveh@insightdist.com   1998-08-31
+  *
+  *   Needs to better identify the signeture WHERE clause.
+  *   May want to also prune out duplicate where clauses.
+  **********************************************************************/
+ void
+ transformKeySetQuery(Query *origNode)
+ {
+ 	/*   Qualify as a key set query candidate  */
+ 	if (_use_keyset_query_optimizer == FALSE || 
+ 			origNode->groupClause ||
+ 			origNode->havingQual ||
+ 			origNode->hasAggs ||
+ 			origNode->utilityStmt ||
+ 			origNode->unionClause ||
+ 			origNode->unionall ||
+ 			origNode->hasSubLinks ||
+ 			origNode->commandType != CMD_SELECT)
+ 		return;
+ 
+ 	/*  Qualify single table query   */
+ 
+ 	/*  Qualify where clause */
+ 	if  ( ! inspectOrNode((Expr*)origNode->qual))  {
+ 		return;
+ 	}
+ 
+ 	/*  Copy essential elements into a union node */
+ 	/*  
+ 	elog(NOTICE, "OR_EXPR=%d, OP_EXPR=%d, AND_EXPR=%d", OR_EXPR, OP_EXPR, AND_EXPR);
+ 	elog(NOTICE, "T_List=%d, T_Expr=%d, T_Var=%d, T_Const=%d", T_List, T_Expr, T_Var, T_Const);
+ 	elog(NOTICE, "opType=%d", ((Expr*)origNode->qual)->opType);
+ 	*/
+ 	while (((Expr*)origNode->qual)->opType == OR_EXPR)  {
+ 		Query	   *unionNode = makeNode(Query);
+ 
+ 		/*   Pull up Expr =  */
+ 		unionNode->qual = lsecond(((Expr*)origNode->qual)->args);
+ 
+ 		/*   Pull up balance of tree  */
+ 		origNode->qual = lfirst(((Expr*)origNode->qual)->args);  
+ 
+ 		/*
+ 		elog(NOTICE, "origNode: opType=%d, nodeTag=%d", ((Expr*)origNode->qual)->opType, nodeTag(origNode->qual));
+ 		elog(NOTICE, "unionNode: opType=%d, nodeTag=%d", ((Expr*)unionNode->qual)->opType, nodeTag(unionNode->qual));
+ 		*/
+ 
+ 		unionNode->commandType = origNode->commandType;
+ 		unionNode->resultRelation = origNode->resultRelation;
+ 		unionNode->isPortal = origNode->isPortal;
+ 		unionNode->isBinary = origNode->isBinary;
+ 
+ 		if (origNode->uniqueFlag)
+ 			unionNode->uniqueFlag = pstrdup(origNode->uniqueFlag);
+ 
+ 		Node_Copy(origNode, unionNode, sortClause);
+ 		Node_Copy(origNode, unionNode, rtable);
+ 		Node_Copy(origNode, unionNode, targetList);
+ 
+ 		origNode->unionClause = lappend(origNode->unionClause, unionNode);
+ 	}
+ 	return;
+ }
+ 
+ 
+ 
+ 
+ static int
+ inspectOrNode(Expr *expr)
+ {
+ 	int fr = 0, sr = 0;
+ 	Expr *firstExpr, *secondExpr;
+ 
+ 	if ( ! (expr && nodeTag(expr) == T_Expr && expr->opType == OR_EXPR))
+ 		return 0;
+ 
+ 	firstExpr = lfirst(expr->args);
+ 	secondExpr = lsecond(expr->args);
+ 	if (nodeTag(firstExpr) != T_Expr || nodeTag(secondExpr) != T_Expr) 
+ 		return 0;
+ 
+ 	if (firstExpr->opType == OR_EXPR)
+ 		fr = inspectOrNode(firstExpr);
+ 	else if (firstExpr->opType == OP_EXPR)    /*   Need to make sure it is last  */
+ 		fr = inspectOpNode(firstExpr);
+ 	else if (firstExpr->opType == AND_EXPR)    /*   Need to make sure it is last  */
+ 		fr = inspectAndNode(firstExpr);
+ 
+ 
+ 	if (secondExpr->opType == AND_EXPR)
+ 		sr = inspectAndNode(secondExpr);
+ 	else if (secondExpr->opType == OP_EXPR)
+ 		sr = inspectOpNode(secondExpr);
+ 		
+ 	return (fr && sr);		
+ }
+ 
+ 
+ static int
+ inspectAndNode(Expr *expr)
+ {
+ 	int fr = 0, sr = 0;
+ 	Expr *firstExpr, *secondExpr;
+ 
+ 	if ( ! (expr && nodeTag(expr) == T_Expr && expr->opType == AND_EXPR))
+ 		return 0;
+ 
+ 	firstExpr = lfirst(expr->args);
+ 	secondExpr = lsecond(expr->args);
+ 	if (nodeTag(firstExpr) != T_Expr || nodeTag(secondExpr) != T_Expr) 
+ 		return 0;
+ 
+ 	if (firstExpr->opType == AND_EXPR)
+ 		fr = inspectAndNode(firstExpr);
+ 	else if (firstExpr->opType == OP_EXPR)
+ 		fr = inspectOpNode(firstExpr);
+ 
+ 	if (secondExpr->opType == OP_EXPR)
+ 		sr = inspectOpNode(secondExpr);
+ 		
+ 	return (fr && sr);		
+ }
+ 
+ 
+ static int
+ /******************************************************************
+  *  Return TRUE if T_Var = T_Const, else FALSE
+  *  Actually it does not test for =.    Need to do this!
+  ******************************************************************/
+ inspectOpNode(Expr *expr)
+ {
+ 	Expr *firstExpr, *secondExpr;
+ 
+ 	if (nodeTag(expr) != T_Expr || expr->opType != OP_EXPR)
+ 		return 0;
+ 
+ 	firstExpr = lfirst(expr->args);
+ 	secondExpr = lsecond(expr->args);
+ 	return  (firstExpr && secondExpr && nodeTag(firstExpr) == T_Var && nodeTag(secondExpr) == T_Const);
+ }
+*** ./include/commands/variable.h.orig	Thu Jul 30 19:27:05 1998
+--- ./include/commands/variable.h	Mon Aug 31 17:23:32 1998
+***************
+*** 54,58 ****
+--- 54,61 ----
+  extern bool show_geqo(void);
+  extern bool reset_geqo(void);
+  extern bool parse_geqo(const char *);
+ extern bool show_ksqo(void);
+ extern bool reset_ksqo(void);
+ extern bool parse_ksqo(const char *);
+  
+  #endif							/* VARIABLE_H */
+*** ./include/optimizer/planmain.h.orig	Mon Aug 31 18:27:03 1998
+--- ./include/optimizer/planmain.h	Mon Aug 31 18:26:04 1998
+***************
+*** 67,71 ****
+--- 67,72 ----
+  extern List *check_having_qual_for_aggs(Node *clause, 
+  					List *subplanTargetList, List *groupClause);
+  extern List *check_having_qual_for_vars(Node *clause, List *targetlist_so_far);
+ extern void transformKeySetQuery(Query *origNode);
+  
+  #endif							/* PLANMAIN_H */
+
+--------------BEFD1E6DA78A2DC20B524E32--
+
+
+
+From daveh@insightdist.com Thu Sep  3 12:34:48 1998
+Received: from u1.abs.net (root@u1.abs.net [207.114.0.131])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA07696
+	for <maillist@candle.pha.pa.us>; Thu, 3 Sep 1998 12:34:46 -0400 (EDT)
+Received: from insightdist.com (nobody@localhost)
+	by u1.abs.net (8.9.0/8.9.0) with UUCP id MAA23590
+	for maillist@candle.pha.pa.us; Thu, 3 Sep 1998 12:17:44 -0400 (EDT)
+X-Authentication-Warning: u1.abs.net: nobody set sender to insightdist.com!daveh using -f
+Received: from ceodev by insightdist.com (AIX 3.2/UCB 5.64/4.03)
+          id AA56436; Thu, 3 Sep 1998 11:51:24 -0400
+Received: from daveh by ceodev (AIX 4.1/UCB 5.64/4.03)
+          id AA45986; Thu, 3 Sep 1998 11:51:24 -0400
+Message-Id: <35EEBBEF.2158F68A@insightdist.com>
+Date: Thu, 03 Sep 1998 11:55:28 -0400
+From: David Hartwig <daveh@insightdist.com>
+Organization: Insight Distribution Systems
+X-Mailer: Mozilla 4.05 [en] (Win95; I)
+Mime-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+Cc: David Hartwig <daybee@bellatlantic.net>, pgsql-patches@postgreSQL.org
+Subject: Re: [PATCHES] Interim AND/OR memory exaustion fix.
+References: <199809030236.WAA22888@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+
+
+Bruce Momjian wrote:
+
+> > I will be cleaning this up more before the Oct 1 deadline.
+>
+> > *** ./backend/commands/variable.c.orig        Thu Jul 30 19:25:26 1998
+> > --- ./backend/commands/variable.c     Mon Aug 31 17:23:32 1998
+>
+> Applied.  Let's keep talking to see if we can come up with a nice
+> general solution to this.
+>
+
+Agreed.
+
+> I have been thinking, and the trouble case is a query that uses only one
+> table, and had only "column = value" statements.  I believe this can be
+> easily identified and reworked somehow.
+>
+
+If you are referring to the AND'less set of OR's, I do have plans to not let
+that qualify since you have gotten the index scan working with OR's.
+
+I also think that the qualification process should be tightened up.   For
+example force the number of AND's to be the same in each OR grouping.   And
+have at least n OR's to qualify.    We just need to head off the memory
+exhaustion.
+
+> Your subtable idea may be a good one.
+>
+
+This sounds like a 6.5 thing.   I needed to stop the bleeding for 6.4.
+
+
+From bga@mug.org Tue Sep  8 03:39:37 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA06237
+	for <maillist@candle.pha.pa.us>; Tue, 8 Sep 1998 03:39:36 -0400 (EDT)
+Received: from bgalli.mug.org (bajor.mug.org [207.158.132.1]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id DAA03648 for <maillist@candle.pha.pa.us>; Tue, 8 Sep 1998 03:38:52 -0400 (EDT)
+Received: from localhost (bga@localhost) by bgalli.mug.org (8.8.7/SCO5) with SMTP id DAA02895 for <maillist@candle.pha.pa.us>; Tue, 8 Sep 1998 03:31:26 -0400 (EDT)
+Message-Id: <199809080731.DAA02895@bgalli.mug.org>
+X-Authentication-Warning: bgalli.mug.org: bga@localhost didn't use HELO protocol
+X-Mailer: exmh version 2.0.2 2/24/98
+From: "Billy G. Allie" <Bill.Allie@mug.org>
+Reply-To: "Billy G. Allie" <Bill.Allie@mug.org>
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of "Mon, 31 Aug 1998 00:36:34 EDT."
+             <199808310436.AAA07618@candle.pha.pa.us> 
+Mime-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Date: Tue, 08 Sep 1998 03:31:26 -0400
+Sender: bga@mug.org
+Status: ROr
+
+Bruce Momjian writes:
+
+> I have been thinking about this.  First, we can easily use fopen(r+) to
+> check to see if the file exists, and if it does read the pid and do a
+> kill -0 to see if it is running.  If no one else does it, I will take it
+> on.
+
+It is better to use open with the O_CREAT and O_EXCL set.  If the file does not
+exist it will be created and the PID can be written to it.  If the file exists 
+then the call will fail, at which point it can be opened with fread, and the 
+PID it contains can be checked to see if it still exists with kill.  The open 
+call has the added advantage that 'The check for the existence of the file and 
+the creation of the file if it does not exist is atomic with respect to other 
+processes executing open naming the same filename in the same directory with 
+O_EXCL and O_CREAT set.' [from the UnixAWare 7 man page, open(2)].
+
+Also, you can't just delete the file, create it and write the your PID to it 
+and assume that you have the lock, you need to close the file, sleep some 
+small amount of time and then open and read the file to see if you still have 
+the lock.  If you like, I can take this task on.
+
+Oh, the postmaster must clear the PID when it exits.
+
+> 
+> Second, where to put the pid file.  There is reason to put in /tmp,
+> because it will get cleared in a reboot, and because it is locking the
+> port number 5432.  There is also reason to put it in /data because you
+> can't have more than one postmaster running on a single data directory.
+> 
+> So, we really want to lock both places.  If this is going to make it
+> easier for people to run more than one postmaster, because it will
+> prevent/warn administrators when they try and put two postmasters in the
+> same data dir or port, I say create the pid lock files both places, and
+> give the admin a clear description of what he is doing wrong in each
+> case.
+
+IHMO, the pid should be put in the data directory.  The reasoning that it will get cleared in a reboot is not sufficent since the logic used to create the PID file will delete it if the PID it contains is not a running process.  Besides, I have used systems where /tmp was not cleared out on a re-boot (for various reasons).  Also, I would rather have a script that explicitly removes the PID locking file at system statup (if it exists), in which case, it doesn't matter where it resides.
+-- 
+____       | Billy G. Allie    | Domain....: Bill.Allie@mug.org
+|  /|      | 7436 Hartwell     | Compuserve: 76337,2061
+|-/-|----- | Dearborn, MI 48126| MSN.......: B_G_Allie@email.msn.com
+|/  |LLIE  | (313) 582-1540    | 
+
+
+
+From owner-pgsql-general@hub.org Thu Oct  1 14:00:57 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA12443
+	for <maillist@candle.pha.pa.us>; Thu, 1 Oct 1998 14:00:56 -0400 (EDT)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id NAA07930 for <maillist@candle.pha.pa.us>; Thu, 1 Oct 1998 13:57:47 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id NAA26913;
+	Thu, 1 Oct 1998 13:56:29 -0400 (EDT)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 01 Oct 1998 13:55:56 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id NAA26856
+	for pgsql-general-outgoing; Thu, 1 Oct 1998 13:55:54 -0400 (EDT)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-general@postgreSQL.org using -f
+Received: from mail.utexas.edu (wb3-a.mail.utexas.edu [128.83.126.138])
+	by hub.org (8.8.8/8.8.8) with SMTP id NAA26840
+	for <pgsql-general@hub.org>; Thu, 1 Oct 1998 13:55:49 -0400 (EDT)
+	(envelope-from taral@mail.utexas.edu)
+Received: (qmail 1198 invoked by uid 0); 1 Oct 1998 17:55:40 -0000
+Received: from dial-24-13.ots.utexas.edu (HELO taral) (128.83.128.157)
+  by umbs-smtp-3 with SMTP; 1 Oct 1998 17:55:40 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: <pgsql-general@hub.org>
+Subject: [GENERAL] CNF vs DNF
+Date: Thu, 1 Oct 1998 12:55:39 -0500
+Message-ID: <000001bded64$b34b2200$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+In-Reply-To: <F10BB1FAF801D111829B0060971D839F445B3D@cpsmail>
+Importance: Normal
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+> select * from aa where (bb = 2 and ff = 3) or (bb = 4 and ff = 5);
+
+I've been told that the system restructures these in CNF (conjunctive normal
+form)... i.e. the above query turns into:
+
+select * from aa where (bb = 2 or bb = 4) and (ff = 3 or bb = 4) and (bb = 2
+or ff = 5) and (ff = 3 or ff = 5);
+
+Much longer and much less efficient, AFAICT. Isn't it more efficient to do a
+union of many queries (DNF) than an intersection of many subqueries (CNF)?
+Certainly remembering the subqueries takes less memory... Also, queries
+already in DNF are probably more common than queries in CNF, requiring less
+rewrite.
+
+Can someone clarify this?
+
+Taral
+
+
+
+From taral@mail.utexas.edu Fri Oct  2 01:35:42 1998
+Received: from mail.utexas.edu (wb1-a.mail.utexas.edu [128.83.126.134])
+	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id BAA28231
+	for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 01:35:27 -0400 (EDT)
+Received: (qmail 16318 invoked by uid 0); 2 Oct 1998 05:35:13 -0000
+Received: from dial-42-8.ots.utexas.edu (HELO taral) (128.83.111.216)
+  by umbs-smtp-1 with SMTP; 2 Oct 1998 05:35:13 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: "Bruce Momjian" <maillist@candle.pha.pa.us>
+Cc: <pgsql-general@postgreSQL.org>
+Subject: RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+Date: Fri, 2 Oct 1998 00:35:12 -0500
+Message-ID: <000001bdedc6$6cf75d20$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+In-Reply-To: <199810020218.WAA23299@candle.pha.pa.us>
+Importance: Normal
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+Status: ROr
+
+> It currently convert to CNF so it can select the most restrictive
+> restriction and join, and use those first.  However, the CNF conversion
+> is a memory exploder for some queries, and we certainly need to have
+> another method to split up those queries into UNIONS.  I think we need
+> to code to identify those queries capable of being converted to UNIONS,
+> and do that before the query gets to the CNF section.  That would be
+> great, and David Hartwig has implemented a limited capability of doing
+> this, but we really need a general routine to do this with 100%
+> reliability.
+
+Well, if you're talking about a routine to generate a heuristic for CNF vs.
+DNF, it is possible to precalculate the query sizes for CNF and DNF
+rewrites...
+
+For conversion to CNF:
+
+At every node:
+
+if nodeType = AND then f(node) = f(left) + f(right)
+if nodeType = OR then f(node) = f(left) * f(right)
+
+f(root) = a reasonably (but not wonderful) metric
+
+For DNF just switch AND and OR in the above. You may want to compute both
+metrics and compare... take the smaller one and use that path.
+
+How to deal with other operators depends on their implementation...
+
+Taral
+
+
+From taral@mail.utexas.edu Fri Oct  2 12:48:27 1998
+Received: from mail.utexas.edu (wb4-a.mail.utexas.edu [128.83.126.140])
+	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA11438
+	for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 12:48:25 -0400 (EDT)
+Received: (qmail 15628 invoked by uid 0); 2 Oct 1998 16:47:50 -0000
+Received: from dial-42-8.ots.utexas.edu (HELO taral) (128.83.111.216)
+  by umbs-smtp-4 with SMTP; 2 Oct 1998 16:47:50 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: "Bruce Momjian" <maillist@candle.pha.pa.us>
+Cc: <hackers@postgreSQL.org>
+Subject: RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+Date: Fri, 2 Oct 1998 11:47:48 -0500
+Message-ID: <000301bdee24$63308740$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+In-reply-to: <199810021640.MAA10925@candle.pha.pa.us>
+Importance: Normal
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+Status: RO
+
+> > Create a temporary oid hash? (for each table selected on, I guess)
+>
+> What I did with indexes was to run the previous OR clause index
+> restrictions through the qualification code, and make sure it failed,
+> but I am not sure how that is going to work with a more complex WHERE
+> clause.  Perhaps I need to restrict this to just simple cases of
+> constants, which are easy to pick out an run through.  Doing this with
+> joins would be very hard, I think.
+
+Actually, I was thinking more of an index of returned rows... After each
+subquery, the backend would check each row to see if it was already in the
+index... Simple duplicate check, in other words. Of course, I don't know how
+well this would behave with large tables being returned...
+
+Anyone else have some ideas they want to throw in?
+
+Taral
+
+
+From taral@mail.utexas.edu Fri Oct  2 17:13:01 1998
+Received: from mail.utexas.edu (wb1-a.mail.utexas.edu [128.83.126.134])
+	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id RAA20838
+	for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 17:12:27 -0400 (EDT)
+Received: (qmail 17418 invoked by uid 0); 2 Oct 1998 21:12:19 -0000
+Received: from dial-46-30.ots.utexas.edu (HELO taral) (128.83.112.158)
+  by umbs-smtp-1 with SMTP; 2 Oct 1998 21:12:19 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: "Bruce Momjian" <maillist@candle.pha.pa.us>, <jwieck@debis.com>
+Cc: <hackers@postgreSQL.org>
+Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+Date: Fri, 2 Oct 1998 16:12:19 -0500
+Message-ID: <000001bdee49$56c7cd40$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+In-reply-to: <199810021758.NAA15524@candle.pha.pa.us>
+Importance: Normal
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+Status: ROr
+
+> Another idea is that we rewrite queries such as:
+>
+> 	SELECT *
+> 	FROM tab
+> 	WHERE (a=1 AND b=2 AND c=3) OR
+> 	      (a=1 AND b=2 AND c=4) OR
+> 	      (a=1 AND b=2 AND c=5) OR
+> 	      (a=1 AND b=2 AND c=6)
+>
+> into:
+>
+> 	SELECT *
+> 	FROM tab
+> 	WHERE (a=1 AND b=2) AND (c=3 OR c=4 OR c=5 OR c=6)
+
+Very nice, but that's like trying to code factorization of numbers... not
+pretty, and very CPU intensive on complex queries...
+
+Taral
+
+
+From taral@mail.utexas.edu Fri Oct  2 17:49:59 1998
+Received: from mail.utexas.edu (wb2-a.mail.utexas.edu [128.83.126.136])
+	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id RAA21488
+	for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 17:49:52 -0400 (EDT)
+Received: (qmail 23729 invoked by uid 0); 2 Oct 1998 21:49:27 -0000
+Received: from dial-2-6.ots.utexas.edu (HELO taral) (128.83.204.22)
+  by umbs-smtp-2 with SMTP; 2 Oct 1998 21:49:27 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: "Bruce Momjian" <maillist@candle.pha.pa.us>
+Cc: <jwieck@debis.com>, <hackers@postgreSQL.org>
+Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+Date: Fri, 2 Oct 1998 16:49:26 -0500
+Message-ID: <000001bdee4e$86688b20$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+In-Reply-To: <199810022139.RAA21082@candle.pha.pa.us>
+Importance: Normal
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+Status: ROr
+
+> > Very nice, but that's like trying to code factorization of
+> numbers... not
+> > pretty, and very CPU intensive on complex queries...
+>
+> Yes, but how large are the WHERE clauses going to be?  Considering the
+> cost of cnfify() and UNION, it seems like a clear win.  Is it general
+> enough to solve our problems?
+
+Could be... the examples I received where the cnfify() was really bad were
+cases where the query was submitted alredy in DNF... and where the UNION was
+a simple one. However, I don't know of any algorithms for generic
+simplification of logical constraints. One problem is resolution/selection
+of factors:
+
+SELECT * FROM a WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 2 AND c
+= 3) OR (a = 1 AND b = 5 AND c = 3) OR (a = 1 AND b = 2 AND c = 6);
+
+Try that on for size. You can understand why that code gets ugly, fast.
+Somebody could try coding it, but it's not a clear win to me.
+
+My original heuristic was missing one thing: "Where the heuristic fails to
+process or decide, default to CNF." Since that's the current behavior, we're
+less likely to break things.
+
+Taral
+
+
+From owner-pgsql-hackers@hub.org Fri Oct  2 19:28:09 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA23341
+	for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 19:28:08 -0400 (EDT)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id SAA18003 for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 18:21:37 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id SAA01250;
+	Fri, 2 Oct 1998 18:08:02 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 02 Oct 1998 18:04:37 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id SAA00847
+	for pgsql-hackers-outgoing; Fri, 2 Oct 1998 18:04:35 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f
+Received: from mail.utexas.edu (wb2-a.mail.utexas.edu [128.83.126.136])
+	by hub.org (8.8.8/8.8.8) with SMTP id SAA00806
+	for <hackers@postgreSQL.org>; Fri, 2 Oct 1998 18:04:26 -0400 (EDT)
+	(envelope-from taral@mail.utexas.edu)
+Received: (qmail 29662 invoked by uid 0); 2 Oct 1998 22:04:25 -0000
+Received: from dial-2-6.ots.utexas.edu (HELO taral) (128.83.204.22)
+  by umbs-smtp-2 with SMTP; 2 Oct 1998 22:04:25 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: "Bruce Momjian" <maillist@candle.pha.pa.us>
+Cc: <jwieck@debis.com>, <hackers@postgreSQL.org>
+Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+Date: Fri, 2 Oct 1998 17:04:24 -0500
+Message-ID: <000201bdee50$9d9c4320$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+In-Reply-To: <199810022157.RAA21769@candle.pha.pa.us>
+Importance: Normal
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+> How do we do that with UNION, and return the right rows.  Seems the
+> _join_ happending multiple times would be much worse than the factoring.
+
+Ok... We have two problems:
+
+1) DNF for unjoined queries.
+2) Factorization for the rest.
+
+I have some solutions for (1). Not for (2). Remember that unjoined queries
+are quite common. :)
+
+For (1), we can always try to parallel the multiple queries... especially in
+the case where a sequential search is required.
+
+Taral
+
+
+
+From owner-pgsql-hackers@hub.org Sat Oct  3 23:32:35 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06644
+	for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 23:31:13 -0400 (EDT)
+Received: from hub.org (root@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id XAA26912 for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 23:14:01 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id WAA04407;
+	Sat, 3 Oct 1998 22:07:05 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 03 Oct 1998 22:02:00 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id WAA04010
+	for pgsql-hackers-outgoing; Sat, 3 Oct 1998 22:01:59 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f
+Received: from candle.pha.pa.us (maillist@s5-03.ppp.op.net [209.152.195.67])
+	by hub.org (8.8.8/8.8.8) with ESMTP id WAA03968
+	for <hackers@postgreSQL.org>; Sat, 3 Oct 1998 22:00:37 -0400 (EDT)
+	(envelope-from maillist@candle.pha.pa.us)
+Received: (from maillist@localhost)
+	by candle.pha.pa.us (8.9.0/8.9.0) id VAA04640;
+	Sat, 3 Oct 1998 21:57:30 -0400 (EDT)
+From: Bruce Momjian <maillist@candle.pha.pa.us>
+Message-Id: <199810040157.VAA04640@candle.pha.pa.us>
+Subject: Re: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+In-Reply-To: <000201bdee50$9d9c4320$3b291f0a@taral> from Taral at "Oct 2, 1998  5: 4:24 pm"
+To: taral@mail.utexas.edu (Taral)
+Date: Sat, 3 Oct 1998 21:57:30 -0400 (EDT)
+Cc: jwieck@debis.com, hackers@postgreSQL.org
+X-Mailer: ELM [version 2.4ME+ PL47 (25)]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+
+I have another idea.
+
+When we cnfify, this:
+
+	(A AND B) OR (C AND D)
+
+becomes
+
+	(A OR C) AND (A OR D) AND (B OR C) AND (B OR D)
+
+however if A and C are identical, this could become:
+
+	(A OR A) AND (A OR D) AND (B OR A) AND (B OR D)
+
+and A OR A is A:
+
+	A AND (A OR D) AND (B OR A) AND (B OR D)
+
+and since we are now saying A has to be true, we can remove OR's with A:
+
+	A AND (B OR D)
+
+Much smaller, and a big win for queries like:
+
+	SELECT *
+	FROM tab
+	WHERE	(a=1 AND b=2) OR
+		(a=1 AND b=3)
+
+This becomes:
+
+		(a=1) AND (b=2 OR b=3)
+
+which is accurate, and uses our OR indexing.
+
+Seems I could code cnfify() to look for identical qualifications in two
+joined OR clauses and remove the duplicates.
+
+Sound like big win, and fairly easy and inexpensive in processing time.
+
+Comments?
+
+-- 
+  Bruce Momjian                        |  http://www.op.net/~candle
+  maillist@candle.pha.pa.us            |  (610) 853-3000
+  +  If your life is a hard drive,     |  830 Blythe Avenue
+  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
+
+
+
+From taral@mail.utexas.edu Sat Oct  3 22:43:41 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA05961
+	for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 22:42:18 -0400 (EDT)
+Received: from mail.utexas.edu (wb2-a.mail.utexas.edu [128.83.126.136]) by renoir.op.net (o1/$Revision: 1.1 $) with SMTP id WAA25111 for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 22:27:34 -0400 (EDT)
+Received: (qmail 25622 invoked by uid 0); 4 Oct 1998 02:26:21 -0000
+Received: from dial-42-9.ots.utexas.edu (HELO taral) (128.83.111.217)
+  by umbs-smtp-2 with SMTP; 4 Oct 1998 02:26:21 -0000
+From: "Taral" <taral@mail.utexas.edu>
+To: "Bruce Momjian" <maillist@candle.pha.pa.us>
+Cc: <jwieck@debis.com>, <hackers@postgreSQL.org>
+Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
+Date: Sat, 3 Oct 1998 21:26:20 -0500
+Message-ID: <000501bdef3e$5f5293a0$3b291f0a@taral>
+MIME-Version: 1.0
+Content-Type: text/plain;
+	charset="iso-8859-1"
+Content-Transfer-Encoding: 7bit
+X-Priority: 3 (Normal)
+X-MSMail-Priority: Normal
+X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
+Importance: Normal
+In-Reply-To: <199810040157.VAA04640@candle.pha.pa.us>
+X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
+Status: ROr
+
+> however if A and C are identical, this could become:
+>
+> 	(A OR A) AND (A OR D) AND (B OR A) AND (B OR D)
+>
+> and A OR A is A:
+>
+> 	A AND (A OR D) AND (B OR A) AND (B OR D)
+>
+> and since we are now saying A has to be true, we can remove OR's with A:
+>
+> 	A AND (B OR D)
+
+Very nice... and you could do that after each iteration of the rewrite,
+preventing the size from getting too big. :)
+
+I have a symbolic expression tree evaluator that would be perfect for
+this... I'll see if I can't adapt it.
+
+Can someone mail me the structures for expression trees? I don't want to
+have to excise them from the source. Please?
+
+Taral
+
+
+From daveh@insightdist.com Mon Nov  9 13:31:07 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA00997
+	for <maillist@candle.pha.pa.us>; Mon, 9 Nov 1998 13:31:00 -0500 (EST)
+Received: from u1.abs.net (root@u1.abs.net [207.114.0.131]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id NAA26657 for <maillist@candle.pha.pa.us>; Mon, 9 Nov 1998 13:10:14 -0500 (EST)
+Received: from insightdist.com (nobody@localhost)
+	by u1.abs.net (8.9.0/8.9.0) with UUCP id MAA17710
+	for maillist@candle.pha.pa.us; Mon, 9 Nov 1998 12:52:05 -0500 (EST)
+X-Authentication-Warning: u1.abs.net: nobody set sender to insightdist.com!daveh using -f
+Received: from ceodev by insightdist.com (AIX 3.2/UCB 5.64/4.03)
+          id AA43498; Mon, 9 Nov 1998 12:38:24 -0500
+Received: from daveh by ceodev (AIX 4.1/UCB 5.64/4.03)
+          id AA54446; Mon, 9 Nov 1998 12:38:23 -0500
+Message-Id: <3647296F.6F7FDDD2@insightdist.com>
+Date: Mon, 09 Nov 1998 12:42:07 -0500
+From: David Hartwig <daveh@insightdist.com>
+Organization: Insight Distribution Systems
+X-Mailer: Mozilla 4.5 [en] (Win98; I)
+X-Accept-Language: en
+Mime-Version: 1.0
+To: Bob Kruger <bkruger@mindspring.com>,
+        Bruce Momjian <maillist@candle.pha.pa.us>
+Cc: pgsql-general@postgreSQL.org, Byron Nikolaidis <byronn@insightdist.com>
+Subject: Re: [GENERAL] Incrementing a Serial Field
+References: <3.0.5.32.19981109110757.0082c950@mindspring.com>
+Content-Type: multipart/mixed;
+	boundary="------------3D3EE7F67DFC542D3928BB7E"
+Status: ROr
+
+This is a multi-part message in MIME format.
+--------------3D3EE7F67DFC542D3928BB7E
+Content-Type: multipart/alternative;
+ boundary="------------43E2CC34278FA08EFC9E0611"
+
+
+--------------43E2CC34278FA08EFC9E0611
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+
+
+
+Bob Kruger wrote:
+
+> The second question is that I noticed the ODBC bug (feature?) when linking
+> Postgres to MS Access still exists.  This bug occurs when linking a MS
+> Access table to a Postgres table, and identifying more than one field as
+> the unique record identifier.  This makes Postgres run until it exhausts
+> all available memory.  Does anyone know a way around this?  Enabling read
+> only ODBC is a feature I would like to make available, but I do not want
+> the possibility of postgres crashing because of an error on the part of a
+> MS Access user.
+>
+> BTW - Having capability to be linked to an Access database is not an
+> option.  The current project I am working on calls for that, so it is a
+> necessary evil that I hav to live with.
+>
+
+In the driver connection settings add the following line.
+
+    SET ksql TO 'on';
+
+Stands for: keyset query optimization.  This is not considered a final
+solution.  As such, it is undocumented.   Some time in the next day or so, we
+will be releasing a version of the driver which will automatically SET ksqo.
+
+You will most likely be satisfied with the results.   One problem with this
+solution, however,  is that it does not work if you have any (some kinds of?)
+arrays in the table you are browsing.   This is a sideffect of the rewrite to a
+UNION which performs an internal sort unique.
+
+Also, if you are using row versioning you may need to overload some operators
+for xid and int4.  I have included a script that will take care of this.
+
+Bruce, can I get these operators hardcoded into 6.4.1- assuming there will be
+one.   The operators  necessitated by the UNION sideffects.
+
+
+--------------43E2CC34278FA08EFC9E0611
+Content-Type: text/html; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+
+<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
+<html>
+&nbsp;
+<p>Bob Kruger wrote:
+<blockquote TYPE=CITE>The second question is that I noticed the ODBC bug
+(feature?) when linking
+<br>Postgres to MS Access still exists.&nbsp; This bug occurs when linking
+a MS
+<br>Access table to a Postgres table, and identifying more than one field
+as
+<br>the unique record identifier.&nbsp; This makes Postgres run until it
+exhausts
+<br>all available memory.&nbsp; Does anyone know a way around this?&nbsp;
+Enabling read
+<br>only ODBC is a feature I would like to make available, but I do not
+want
+<br>the possibility of postgres crashing because of an error on the part
+of a
+<br>MS Access user.
+<p>BTW - Having capability to be linked to an Access database is not an
+<br>option.&nbsp; The current project I am working on calls for that, so
+it is a
+<br>necessary evil that I hav to live with.
+<br>&nbsp;</blockquote>
+In the driver connection settings add the following line.
+<p>&nbsp;&nbsp;<tt>&nbsp; SET ksql TO 'on';</tt><tt></tt>
+<p>Stands for: keyset query optimization.&nbsp; This is not considered
+a final solution.&nbsp; As such, it is undocumented.&nbsp;&nbsp; Some time
+in the next day or so, we will be releasing a version of the driver which
+will automatically SET ksqo.
+<p>You will most likely be satisfied with the results.&nbsp;&nbsp; One
+problem with this solution, however,&nbsp; is that it does not work if
+you have any (some kinds of?) arrays in the table you are browsing.&nbsp;&nbsp;
+This is a sideffect of the rewrite to a UNION which performs an internal
+sort unique.
+<p>Also, if you are using row versioning you may need to overload some
+operators for xid and int4.&nbsp; I have included a script that will take
+care of this.
+<p>Bruce, can I get these operators hardcoded into 6.4.1- assuming there
+will be one.&nbsp;&nbsp; The operators&nbsp; necessitated by the UNION
+sideffects.
+<br>&nbsp;</html>
+
+--------------43E2CC34278FA08EFC9E0611--
+
+--------------3D3EE7F67DFC542D3928BB7E
+Content-Type: text/plain; charset=us-ascii;
+ name="xidint4.sql"
+Content-Transfer-Encoding: 7bit
+Content-Disposition: inline;
+ filename="xidint4.sql"
+
+--   Insight Distribution Systems - System V - Apr 1998
+--   @(#)xidint4.sql	1.2 :/sccs/sql/extend/s.xidint4.sql 10/2/98 13:40:19"
+
+create function int4eq(xid,int4)
+  returns bool
+  as ''
+  language 'internal';
+
+create operator = (
+        leftarg=xid,
+        rightarg=int4,
+        procedure=int4eq,
+        commutator='=',
+        negator='<>',
+        restrict=eqsel,
+        join=eqjoinsel
+        );
+
+create function int4lt(xid,xid)
+  returns bool
+  as ''
+  language 'internal';
+
+create operator < (
+        leftarg=xid,
+        rightarg=xid,
+        procedure=int4lt,
+        commutator='=',
+        negator='<>',
+        restrict=eqsel,
+        join=eqjoinsel
+        );
+
+
+
+--------------3D3EE7F67DFC542D3928BB7E--
+
+
--- a/doc/TODO.detail/flock
+++ b/doc/TODO.detail/flock
+From tgl@sss.pgh.pa.us Sun Aug 30 11:25:23 1998
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA12607
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 11:25:20 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id LAA15788;
+	Sun, 30 Aug 1998 11:23:38 -0400 (EDT)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: dz@cs.unitn.it (Massimo Dal Zotto), hackers@postgreSQL.org
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of Sun, 30 Aug 1998 08:19:52 -0400 (EDT) 
+             <199808301219.IAA08821@candle.pha.pa.us> 
+Date: Sun, 30 Aug 1998 11:23:38 -0400
+Message-ID: <15786.904490618@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Status: RO
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> Can't we just have configure check for flock().  Another idea is to
+> create a 'pid' file in the pgsql/data/base directory, and do a kill -0
+> to see if it is stil running before removing the lock.
+
+The latter approach is what I was going to suggest.  Writing a pid file
+would be a fine idea anyway --- for one thing, it makes it a lot easier
+to write a "kill the postmaster" script.  Given that the postmaster
+should write a pid file, a new postmaster should look for an existing
+pid file, and try to do a kill(pid, 0) on the number contained therein.
+If this doesn't return an error, then you figure there is already a
+postmaster running, complain, and exit.  Otherwise you figure you is it,
+(re)write the pid file and away you go.  Then pqcomm.c can just
+unconditionally delete any old file that's in the way of making the
+pipe.
+
+The pidfile checking and creation probably ought to go in postmaster.c,
+not down inside pqcomm.c.  I never liked the fact that a critical
+interlock function was being done by a low-level library that one might
+not even want to invoke (if all your clients are using TCP, opening up
+the Unix-domain socket is a waste of time, no?).
+
+BTW, there is another problem with relying on flock on the socket file
+for this purpose: it opens up a hole for a denial-of-service attack.
+Anyone who can write the file can flock it.  (We already had a problem
+with DOS via creating a dummy file at /tmp/.s.PGSQL.5432, but it would
+be harder to spot the culprit with an flock-based interference.)
+
+			regards, tom lane
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 12:27:41 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA12976
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 12:27:37 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id MAA09234; Sun, 30 Aug 1998 12:24:51 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 12:23:26 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id MAA09167 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 12:23:25 -0400 (EDT)
+Received: from mambo.cs.unitn.it (mambo.cs.unitn.it [193.205.199.204]) by hub.org (8.8.8/8.7.5) with SMTP id MAA09150 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 12:23:08 -0400 (EDT)
+Received: from boogie.cs.unitn.it (dz@boogie [193.205.199.79]) by mambo.cs.unitn.it (8.6.12/8.6.12) with ESMTP id SAA29572; Sun, 30 Aug 1998 18:21:42 +0200
+Received: (from dz@localhost) by boogie.cs.unitn.it (8.8.5/8.6.9) id SAA05993; Sun, 30 Aug 1998 18:21:41 +0200
+From: Massimo Dal Zotto <dz@cs.unitn.it>
+Message-Id: <199808301621.SAA05993@boogie.cs.unitn.it>
+Subject: Re: [HACKERS] flock patch breaks things here
+To: hackers@postgreSQL.org (PostgreSQL Hackers)
+Date: Sun, 30 Aug 1998 18:21:41 +0200 (MET DST)
+Cc: tgl@sss.pgh.pa.us (Tom Lane)
+In-Reply-To: <15786.904490618@sss.pgh.pa.us> from "Tom Lane" at Aug 30, 98 11:23:38 am
+X-Mailer: ELM [version 2.4 PL24 ME4]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=iso-8859-1
+Content-Transfer-Encoding: 8bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+> 
+> Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> > Can't we just have configure check for flock().  Another idea is to
+> > create a 'pid' file in the pgsql/data/base directory, and do a kill -0
+> > to see if it is stil running before removing the lock.
+> 
+> The latter approach is what I was going to suggest.  Writing a pid file
+> would be a fine idea anyway --- for one thing, it makes it a lot easier
+> to write a "kill the postmaster" script.  Given that the postmaster
+> should write a pid file, a new postmaster should look for an existing
+> pid file, and try to do a kill(pid, 0) on the number contained therein.
+> If this doesn't return an error, then you figure there is already a
+> postmaster running, complain, and exit.  Otherwise you figure you is it,
+> (re)write the pid file and away you go.  Then pqcomm.c can just
+> unconditionally delete any old file that's in the way of making the
+> pipe.
+> 
+> The pidfile checking and creation probably ought to go in postmaster.c,
+> not down inside pqcomm.c.  I never liked the fact that a critical
+> interlock function was being done by a low-level library that one might
+> not even want to invoke (if all your clients are using TCP, opening up
+> the Unix-domain socket is a waste of time, no?).
+> 
+> BTW, there is another problem with relying on flock on the socket file
+> for this purpose: it opens up a hole for a denial-of-service attack.
+> Anyone who can write the file can flock it.  (We already had a problem
+> with DOS via creating a dummy file at /tmp/.s.PGSQL.5432, but it would
+> be harder to spot the culprit with an flock-based interference.)
+
+This came to my mind, but I didn't think this would have happened so
+quickly. In my opinion the socket and the pidfile should be created in a
+directory owned by postgres, for example /tmp/.Pgsql-unix, like does X.
+
+-- 
+Massimo Dal Zotto
+
+----------------------------------------------------------------------+
+|  Massimo Dal Zotto                email:  dz@cs.unitn.it             |
+|  Via Marconi, 141                 phone:  ++39-461-534251            |
+|  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
+|  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
+----------------------------------------------------------------------+
+
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 13:01:10 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id NAA13785
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 13:01:09 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA29386 for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 12:58:24 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id MAA11406; Sun, 30 Aug 1998 12:54:48 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 12:52:22 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id MAA11310 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 12:52:20 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id MAA11296 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 12:52:13 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA16094;
+	Sun, 30 Aug 1998 12:50:55 -0400 (EDT)
+To: Massimo Dal Zotto <dz@cs.unitn.it>
+cc: hackers@postgreSQL.org (PostgreSQL Hackers)
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of Sun, 30 Aug 1998 18:21:41 +0200 (MET DST) 
+             <199808301621.SAA05993@boogie.cs.unitn.it> 
+Date: Sun, 30 Aug 1998 12:50:55 -0400
+Message-ID: <16092.904495855@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Massimo Dal Zotto <dz@cs.unitn.it> writes:
+> In my opinion the socket and the pidfile should be created in a
+> directory owned by postgres, for example /tmp/.Pgsql-unix, like does X.
+
+The pidfile belongs at the top level of the database directory (eg,
+/usr/local/pgsql/data/postmaster.pid), because what it actually
+represents is that there is a postmaster running *for that database
+group*.
+
+If you want to support multiple database sets on one machine (which I
+do), then the interlock has to be per database directory.  Putting the
+pidfile into a common directory would mean we'd have to invent some
+kind of pidfile naming convention to keep multiple postmasters from
+tromping on each other.  This is unnecessarily complex.
+
+I agree with you that putting the socket file into a less easily munged
+directory than /tmp would be a good idea for security.  But that's a
+separate issue.  On machines that understand stickybits for directories,
+the security hole is not really very big.
+
+At this point, the fact that /tmp/.s.PGSQL.port# is the socket path is
+effectively a version-independent aspect of the FE/BE protocol, and so
+we can't change it without breaking old applications.  I'm not sure that
+that's worth the security improvement.
+
+What I'd like to see someday is a postmaster command line switch to tell
+it to use *only* TCP connections and not create a Unix socket at all.
+That hasn't been possible so far, because we were relying on the socket
+file to provide a safety interlock against starting multiple
+postmasters.  But an interlock using a pidfile would be much better.
+(Look around; *every* other Unix daemon I know of that wants to ensure
+that there's only one of it uses a pidfile interlock.  Not file locking.
+There's a reason why that's the well-trodden path.)
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 15:31:13 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id PAA15275
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 15:31:11 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id PAA22194; Sun, 30 Aug 1998 15:27:20 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 15:23:58 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id PAA21800 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 15:23:57 -0400 (EDT)
+Received: from thelab.hub.org (nat0118.mpoweredpc.net [142.177.188.118]) by hub.org (8.8.8/8.7.5) with ESMTP id PAA21696 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 15:22:51 -0400 (EDT)
+Received: from localhost (scrappy@localhost)
+	by thelab.hub.org (8.9.1/8.8.8) with SMTP id QAA18542;
+	Sun, 30 Aug 1998 16:21:29 -0300 (ADT)
+	(envelope-from scrappy@hub.org)
+X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
+Date: Sun, 30 Aug 1998 16:21:28 -0300 (ADT)
+From: The Hermit Hacker <scrappy@hub.org>
+To: Tom Lane <tgl@sss.pgh.pa.us>
+cc: Massimo Dal Zotto <dz@cs.unitn.it>,
+        PostgreSQL Hackers <hackers@postgreSQL.org>
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-Reply-To: <16092.904495855@sss.pgh.pa.us>
+Message-ID: <Pine.BSF.4.02.9808301618350.343-100000@thelab.hub.org>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+On Sun, 30 Aug 1998, Tom Lane wrote:
+
+> Massimo Dal Zotto <dz@cs.unitn.it> writes:
+> > In my opinion the socket and the pidfile should be created in a
+> > directory owned by postgres, for example /tmp/.Pgsql-unix, like does X.
+> 
+> The pidfile belongs at the top level of the database directory (eg,
+> /usr/local/pgsql/data/postmaster.pid), because what it actually
+> represents is that there is a postmaster running *for that database
+> group*.
+
+	I have to agree with this one...but then it also negates the
+argument about the flock() DoS...*grin*
+
+	BTW...I like the kill(pid,0) solution myself, primarily because it
+is, i think, the most portable solution.  
+
+	I would not consider a patch to remove the flock() solution and
+replace it with the kill(pid,0) solution a new feature, just an
+improvement of an existing one...either way, moving the pid file (or
+socket, for that matter) from /tmp should be listed as a security related
+requirement for v6.4 :)
+
+Marc G. Fournier                                
+Systems Administrator @ hub.org 
+primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
+
+
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 22:41:10 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id WAA01526
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 22:41:08 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id WAA29298; Sun, 30 Aug 1998 22:38:18 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 22:35:05 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id WAA29203 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 22:35:03 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id WAA29017 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 22:34:55 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id WAA20075;
+	Sun, 30 Aug 1998 22:34:41 -0400 (EDT)
+To: The Hermit Hacker <scrappy@hub.org>
+cc: PostgreSQL Hackers <hackers@postgreSQL.org>
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of Sun, 30 Aug 1998 16:21:28 -0300 (ADT) 
+             <Pine.BSF.4.02.9808301618350.343-100000@thelab.hub.org> 
+Date: Sun, 30 Aug 1998 22:34:40 -0400
+Message-ID: <20073.904530880@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+The Hermit Hacker <scrappy@hub.org> writes:
+> either way, moving the pid file (or
+> socket, for that matter) from /tmp should be listed as a security related
+> requirement for v6.4 :)
+
+Huh?  There is no pid file being generated in /tmp (or anywhere else)
+at the moment.  If we do add one, it should not go into /tmp for the
+reasons I gave before.
+
+Where the Unix-domain socket file lives is an entirely separate issue.
+
+If we move the socket out of /tmp then we have just kicked away all the
+work we did to preserve backwards compatibility of the FE/BE protocol
+with existing clients.  Being able to talk to a 1.0 client isn't much
+good if you aren't listening where he's going to try to contact you.
+So I think I have to vote in favor of leaving the socket where it is.
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon Aug 31 11:31:19 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA21195
+	for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 11:31:13 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id LAA06827 for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 11:17:41 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA24792; Mon, 31 Aug 1998 11:12:18 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 31 Aug 1998 11:10:31 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA24742 for pgsql-hackers-outgoing; Mon, 31 Aug 1998 11:10:29 -0400 (EDT)
+Received: from trillium.nmsu.edu (trillium.NMSU.Edu [128.123.5.15]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA24725 for <hackers@postgreSQL.org>; Mon, 31 Aug 1998 11:10:22 -0400 (EDT)
+Received: (from brook@localhost)
+	by trillium.nmsu.edu (8.8.8/8.8.8) id JAA03282;
+	Mon, 31 Aug 1998 09:09:01 -0600 (MDT)
+Date: Mon, 31 Aug 1998 09:09:01 -0600 (MDT)
+Message-Id: <199808311509.JAA03282@trillium.nmsu.edu>
+From: Brook Milligan <brook@trillium.NMSU.Edu>
+To: tgl@sss.pgh.pa.us
+CC: dg@informix.com, hackers@postgreSQL.org
+In-reply-to: <23042.904573041@sss.pgh.pa.us> (message from Tom Lane on Mon, 31
+	Aug 1998 10:17:21 -0400)
+Subject: Re: [HACKERS] flock patch breaks things here
+References:  <23042.904573041@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+   I just came up with an idea that might help alleviate the /tmp security
+   exposure without creating a backwards-compatibility problem.  It works
+   like this:
+
+   1. During installation, create a subdirectory of /tmp to hold Postgres'
+   socket files and associated pid lockfiles.  This subdirectory should be
+   owned by the Postgres superuser and have permissions 755
+   (world-readable, writable only by Postgres superuser).  Maybe call it
+   /tmp/.pgsql --- the name should start with a dot to keep it out of the
+   way.  (Bruce points out that some systems clear /tmp during reboot, so
+   it might be that a postmaster will have to be prepared to recreate this
+   directory at startup --- anyone know if subdirectories of /tmp are
+   zapped too?  My system doesn't do that...)
+
+   ...
+
+   I notice that on my system, the X11 socket files in /tmp/.X11-unix are
+   actually symlinks to socket files in /usr/spool/sockets/X11.  I dunno if
+   it's worth our trouble to get into putting our sockets under /usr/spool
+   or /var/spool or whatever --- seems like another configuration choice to
+   mess up.  It'd be nice if the socket directory lived somewhere where the
+   parent dirs weren't world-writable, but this would mean one more thing
+   that you have to have root permissions for in order to install pgsql.
+
+It seems like we need a directory for locks (= pid files) and one for
+sockets (perhaps the same one).  I strongly suggest that the location
+for these be configurable.  By default, it might make sense to put
+them in ~pgsql/locks and ~pgsql/sockets.  It is easy (i.e., I'll be
+glad to do it) to modify configure.in to take options like
+
+	     --lock-dir=/var/spool/lock
+	     --socket-dir=/var/spool/sockets
+
+that set cc defines and have the code respond accordingly.  This way,
+those who don't care (or don't have root access) can use the defaults,
+whereas those with root access who like to keep locks and sockets in a
+common place can do so easily.  Either way, multiple postmasters (all
+compiled with the same options of course) can check the appropriate
+locks in the well-known places.  Finally, drop the link into /tmp for
+the old socket and document that it will be disappearing at some
+point, and all is fine.  
+
+If someone wants to give me some guidance on what preprocessor
+variables should be set in response to the above options (or something
+like them), I'll do the configure stuff.
+
+Cheers,
+Brook
+
+
--- a/doc/TODO.detail/fsync
+++ b/doc/TODO.detail/fsync
+From owner-pgsql-general@hub.org Fri Dec 18 06:31:23 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA05554
+	for <maillist@candle.pha.pa.us>; Fri, 18 Dec 1998 06:31:21 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id EAA21127 for <maillist@candle.pha.pa.us>; Fri, 18 Dec 1998 04:46:38 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id EAA01409;
+	Fri, 18 Dec 1998 04:44:19 -0500 (EST)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 18 Dec 1998 04:43:22 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id EAA01093
+	for pgsql-general-outgoing; Fri, 18 Dec 1998 04:43:18 -0500 (EST)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+Received: from dune.krs.ru (dune.krs.ru [195.161.16.38])
+	by hub.org (8.9.1/8.9.1) with ESMTP id EAA01067
+	for <pgsql-general@postgreSQL.org>; Fri, 18 Dec 1998 04:43:09 -0500 (EST)
+	(envelope-from vadim@krs.ru)
+Received: from krs.ru (localhost.krs.ru [127.0.0.1])
+	by dune.krs.ru (8.8.8/8.8.7) with ESMTP id QAA16201;
+	Fri, 18 Dec 1998 16:41:44 +0700 (KRS)
+	(envelope-from vadim@krs.ru)
+Message-ID: <367A2354.E998763@krs.ru>
+Date: Fri, 18 Dec 1998 16:41:40 +0700
+From: Vadim Mikheev <vadim@krs.ru>
+Organization: OJSC Rostelecom (Krasnoyarsk)
+X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
+X-Accept-Language: ru, en
+MIME-Version: 1.0
+To: Anton de Wet <adw@obsidian.co.za>
+CC: pgsql-general@postgreSQL.org
+Subject: Re: [GENERAL] Why PostgreSQL is better than other commerial softwares?
+References: <Pine.LNX.4.04.9812181046030.9458-100000@ra.obsidian.co.za>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Anton de Wet wrote:
+> 
+> >
+> > Often quick mailing list support?
+> 
+> :-)
+> 
+> While on the subject I finally found the solution to a problem I (and one
+> or two other people) posted about without answer. (So sometimes it's slow
+> mailing list support).
+> 
+> In importing about 5 million records (which I copy in blocks of 10000) the
+> copy became linearly slower. After a friend RTFM and refered me, I used
+> the -F switch (passed by the postmaster to the backend processes) and the
+> time became linear and a LOT shorter. Import time for the 5000000 records
+> now the same (or maybe even slightly faster, I didn't accurately time
+> them) as importing the data into oracle on the same machine.
+
+"While on the subject..." -:)
+
+This is the problem of buffer manager, known for very long time:
+when copy eats all buffers, manager begins write/fsync each
+durty buffer to free buffer for new data. All updated relations
+should be fsynced _once_ @ transaction commit. You would get
+the same results without -F...
+I still have no time to implement this -:(
+
+Vadim
+
+
--- a/doc/TODO.detail/lex
+++ b/doc/TODO.detail/lex
+From selkovjr@mcs.anl.gov Sat Jul 25 05:31:05 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA16564
+	for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:31:03 -0400 (EDT)
+Received: from antares.mcs.anl.gov (mcs.anl.gov [140.221.9.6]) by renoir.op.net (o1/$Revision: 1.1 $) with SMTP id FAA01775 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:28:22 -0400 (EDT)
+Received: from mcs.anl.gov (wit.mcs.anl.gov [140.221.5.148]) by antares.mcs.anl.gov (8.6.10/8.6.10)  with ESMTP
+	id EAA28698 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 04:27:05 -0500
+Sender: selkovjr@mcs.anl.gov
+Message-ID: <35B9968D.21CF60A2@mcs.anl.gov>
+Date: Sat, 25 Jul 1998 08:25:49 +0000
+From: "Gene Selkov, Jr." <selkovjr@mcs.anl.gov>
+Organization: MCS, Argonne Natl. Lab
+X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.32 i586)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+Subject: position-aware scanners
+References: <199807250524.BAA07296@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+Bruce,
+
+I attached here (trough the web links) a couple examples, totally
+irrelevant to postgres but good enough to discuss token locations. I
+might as well try to patch the backend parser, though not sure how soon.
+
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+1. 
+
+The first c parser I wrote,
+http://wit.mcs.anl.gov/~selkovjr/unit-troff.tgz, is not very
+sophisticated, so token locations reported by yyerr() may be slightly
+incorrect (+/- one position depending on the existence and type of the
+lookahead token. It is a filter used to typeset the units of measurement
+with eqn. To use it, unpack the tar file and run make. The Makefile is
+not too generic but I built it on various systems including linux,
+freebsd and sunos 4.3. The invocation can be something like this:
+
+./check 0 parse "l**3/(mmoll*min)"
+parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
+`'(''
+
+l**3/(mmoll*min)
+      ^^^^^
+
+Now to the guts. As far as I can imagine, the only way to consistently
+keep track of each character read by the scanner (regardless of the
+length of expressions it will match) is to redefine its YY_INPUT like
+this:
+
+#undef YY_INPUT
+#define YY_INPUT(buf,result,max_size) \
+{ \
+	int c	= (int) buffer[pos++]; \
+	result = (c == '\0') ?	YY_NULL	: (buf[0] = c, 1); \
+}
+
+Here, buffer is the pointer to the origin of the string being scanned
+and pos is a global variable, similar in usage to a file pointer (you
+can both read and manipulate it at will). The buffer and the pointer are
+initialized by the function 
+
+void setString(char *s)
+{
+   buffer = s;
+   pos = 0;
+}
+
+each time the new string is to be parsed. This (exportable) function is
+part of the interface. 
+
+In this simplistic design, yyerror() is part of the scanner module and
+it uses the pos variable to report the location of unexpected tokens.
+The downside of such arrangement is that in case of error condition, you
+can't easily tell whether your context is current or lookahead token, it
+just reports the position of the last token read (be it $ (end of
+buffer) or something else):
+
+./check 0 convert "mol/foo"
+parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
+`'(''
+
+mol/foo
+       ^^^
+
+(should be at the beginning of "foo")
+
+./check 0 convert "mmol//l"        
+parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
+`'(''
+
+mmol//l
+    ^
+
+(should be at the second '/')
+
+
+I believe this is why most simple parsers made with yacc would report
+parse errors being "at or near" some token, which is fair enough if the
+expression is not too complex.
+
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+2. The second version of the same scanner,
+http://wit.mcs.anl.gov/~selkovjr/scanner-example.tgz, addresses this
+problem by recording exact locations of the tokens in each instance of
+the token semantic data structure. The global,
+
+UNIT_YYSTYPE unit_yylval;
+
+would be normally used to export the token semantics (including its
+original or modified text and location data) to the parser.
+Unfortunately, I cannot show you the parser part in c, because that's
+about when I stopped writing parsers in c. Instead, I included a small
+test program, test.c, that mimics the parser's expectations for the
+scanner data pretty well. I am assuming here that you are not interested
+in digging someone else's ugly guts for relatively small bit of
+information; let me know if I am wrong and I will send you the complete
+perl code (also generated with bison).
+
+To run this example, unpack the tar file and run Make. Then do
+
+  gcc test.c scanner.o
+
+and run a.out
+
+Note the line
+
+    yylval = unit_getyylval();
+
+in test.c. You will not normally need it in a c parser. It is enough to
+define yylval as an external variable and link it to yylval in yylex()
+
+In the bison-generated parser, yylval gets pushed into a stack (pointed
+to by yylsp) each time a new token is read. For each syntax rule, the
+bison macros @1, @2, ... are just shortcuts to locations in the stack 1,
+2, ... levels deep. In following code fragment, @3 refers to the
+location info for the third term in the rule (INTEGER):
+
+(sorry about perl, but I think you can do the same things in c without
+significant changes to your existing parser)
+
+term:           base    {
+                        $$ = $1;
+                        $$->{'order'} = 1;
+                }
+        |       base EXP INTEGER {
+                        $$ = $1;
+                        $$->{'order'} = @3->{'text'};
+                        $$->{'scale'} = $$->{'scale'} ** $$->{'order'};
+                        if ( $$->{'order'} == 0 ) {
+                                yyerror("Error: expecting a non-zero
+integer exponent");
+                                YYERROR;
+                        }
+                }
+
+
+which translates to:
+
+  ($yyn == 10)    && do {
+          $yyval = $yyvsa[-1];
+          $yyval->{'order'} = 1;
+          last SWITCH;
+  };
+
+  ($yyn == 11)    && do {
+          $yyval = $yyvsa[-3];
+          $yyval->{'order'} = $yylsa[-1]->{'text'}
+          $yyval->{'scale'} = $yyval->{'scale'} ** $yyval->{'order'};
+          if ( $yyval->{'order'} == 0 ) {
+                   yyerror("Error: expecting a non-zero integer
+exponent");
+                   goto yyerrlab1 ;
+          }
+          last SWITCH;
+  };
+
+In c, you will have a bit more complicated pointer arithmetic to adress
+the stack, but the usage of objects will be the same. Note here that it
+is convenient to keep all information about the token in its location
+info, (yylsa, yylsp, yylval, @n), while everything relating to the value
+of the expression, or to the parse tree, is better placed in the
+semantic stack (yyssa, yyssp, yysval, $n). Also note that in some cases
+you can do semantic checks inside rules and report useful messages
+before or instead of invoking yyerror();
+
+Finally, it is useful to make the following wrapper function around
+external yylex() in order to maintain your own token stack. Unlike the
+parser's internal stack which is only as deep as the rule being reduced,
+this one can hold all tokens recognized during the current run, and that
+can be extremely helpful for error reporting and any transformations you
+may need. In this way, you can even scan (tokenize) the whole buffer
+before handing it off to the parser (who knows, you may need a token
+ahead of what is currently seen by the parser):
+
+
+sub tokenize {
+    undef @tokenTable;
+    my ($tok, $text, $name, $unit, $first_line, $first_column,
+$last_line, $last_column);
+    
+    while ( ($tok = &UnitLex::yylex()) > 0 ) { # this is where the
+c-coded yylex is called,
+                                               # UnitLex is the perl
+extension encapsulating it                            
+       ( $text, $name, $unit, $first_line, $first_column, $last_line,
+$last_column ) = &UnitLex::getyylval;
+       push(@tokenTable, 
+           Unit::yyltype->new (
+              'token'         => $tok,
+              'text'          => $text,
+              'name'          => $name,
+              'unit'          => $unit,
+              'first_line'    => $first_line,
+              'first_column'  => $first_column,
+              'last_line'     => $last_line,
+              'last_column'   => $last_column,
+           )
+       )
+    }
+
+}
+
+
+It is now a lot easier to handle various state-related problems, such as
+backtracking and error reporting. The yylex() function as seen by the
+parser might be constructed somewhat like this:
+
+sub yylex {
+    $yylloc = $tokenTable[$tokenNo];  # $tokenNo is a global; now
+instead of a "file pointer",
+                                      # as in the first example, we have
+a "token pointer"
+    undef $yylval;
+
+
+    # disregard this; name this block "computing semantic values"       
+    if ( $yylloc->{'token'} == UNIT) {
+        $yylval = Unit::Operand->new(
+        'unit'  => Unit::Dict::unit($yylloc->{'unit'}),
+        'base'  => Unit::Dict::base($yylloc->{'unit'}),
+        'scale' => Unit::Dict::scale($yylloc->{'unit'}),
+        'scaleToBase' => Unit::Dict::scaleToBase($yylloc->{'unit'}),
+        'loc'   => $yylloc,
+       );    
+    }
+    elsif ( ($yylloc->{'token'} == INTEGER ) || ($yylloc->{'token'} ==
+POSITIVE_NUMBER) ) {
+        $yylval = Unit::Operand->new(
+          'unit' => '1',
+          'base' => '1',
+          'scale' => 1,
+          'scaleToBase' => 1,
+          'loc'   => $yylloc,
+        );
+    }
+
+    $tokenNo++;
+    return(%{$yylloc}->{'token'}); # This is all the parser needs to
+know about this token. 
+                                   # But we already made sure we saved
+everything we need to know.
+}
+
+
+Now the most interesting part, the error reporting routine:
+
+
+sub yyerror {
+    my ($str) = @_;
+    my ($message, $start, $end, $loc);
+
+    $loc = $tokenTable[$tokenNo-1]; # This is the same as to say, 
+                                    # "obtain the location info for the
+current token"
+  
+    # You may use this routine for your own purposes or let parser use
+it
+    if( $str ne 'parse error' ) {
+        $message = "$str instead of `" . $loc->{'name'} . "' <" .
+$loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\
+n";
+    }
+    else {
+        $message = "unexpected token `" . $loc->{'name'} . "' <" .
+$loc->{'text'} . ">,  at line " . loc->{'first_line'} . ":\n
+\n";
+    }
+
+    $message .= $parseBuffer . "\n"; # that's the original string that
+was used to set the parser buffer
+
+    $message .= ( ' ' x ($loc->{'first_column'} + 1) ) . ( '^' x
+length($loc->{'text'}) ). "\n";
+    if( $str ne 'parse error' ) {
+        print STDERR "$str instead of `", $loc->{'name'}, "' {",
+$loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
+    }
+    else {
+        print STDERR "unexpected token `", $loc->{'name'}, "' {",
+$loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
+    }
+    
+    print STDERR "$parseBuffer\n";
+    print STDERR ' ' x ($loc->{'first_column'} + 1), '^' x
+length($loc->{'text'}), "\n";
+}
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Scanners used in these examples assume there is a single line of text on
+the input (the first_line and last_line elements of yylloc are simply
+ignored). If you want to be able to parse multi-line buffers, just add a
+lex rule for '\n' that will increment the line count and reset the pos
+variable to zero.
+
+
+Ugly as it may seem, I find this approach extremely liberating. If the
+grammar becomes too complicated for a LALR(1) parser, I can cascade
+multiple parsers. The token table can then be used to reassemble parts
+of original expression for subordinate parsers, preserving the location
+info all the way down, so that subordinate parsers can report their
+problems consistently. You probably don't need this, as SQL is very well
+thought of and has parsable grammar. But it may be of some help, for
+error reporting. 
+
+
+--Gene
+
--- a/doc/TODO.detail/limit
+++ b/doc/TODO.detail/limit
--- a/doc/TODO.detail/logging
+++ b/doc/TODO.detail/logging
+From owner-pgsql-hackers@hub.org Fri Nov 13 13:24:37 1998
+Received: from hub.org (majordom@hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA13457
+	for <maillist@candle.pha.pa.us>; Fri, 13 Nov 1998 13:24:35 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id NAA02464;
+	Fri, 13 Nov 1998 13:22:52 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Nov 1998 13:21:14 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id NAA02331
+	for pgsql-hackers-outgoing; Fri, 13 Nov 1998 13:21:12 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id NAA02316
+	for <pgsql-hackers@postgreSQL.org>; Fri, 13 Nov 1998 13:21:06 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@postgreSQL.org 
+	id m0zeOEf-000EBPC; Fri, 13 Nov 98 19:46 MET
+Message-Id: <m0zeOEf-000EBPC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: [HACKERS] shmem limits and redolog
+To: pgsql-hackers@postgreSQL.org (PostgreSQL HACKERS)
+Date: Fri, 13 Nov 1998 19:46:20 +0100 (MET)
+Reply-To: jwieck@debis.com (Jan Wieck)
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+Hi,
+
+    I'm  currently  hacking  around on a solution for logging all
+    database operations at query level that can recover a crashed
+    database  from  the last successful backup by redoing all the
+    commands.
+
+    Well, I wanted it to be as flexible as can. So I  decided  to
+    make  it  per  database  configurable.  One  could  say which
+    databases are logged and if a database is, if  it  is  logged
+    sync  or async (in sync mode, every COMMIT forces an fsync of
+    the actual logfile and controlfiles).
+
+    To make async mode as fast as can, I'm using a shared  memory
+    of  32K per database (not per backend) that is used as a wrap
+    around  buffer  from  the  backends  to  place  their   query
+    information.  So  the  log writer can fall a little behind if
+    there are many backends doing  different  things  that  don't
+    lock each other.
+
+    Now  I'm  a  little  in  doubt about the shared memory limits
+    reported.  Was it a good decision to use shared memory? Am  I
+    better off using socket's?
+
+    The  bad  thing  in  what  I  have  up  to now (it's far from
+    complete) is, that even if a database isn't currently logged,
+    a redolog writer is started and creates the 32K shmem segment
+    (plus a semaphore set with 5 semaphores). This is  because  I
+    plan to create commands like
+
+        ALTER DATABASE LOG MODE=ASYNC LOGDIR='/somewhere/dbname';
+
+    and the like that can be used at runtime (while more than one
+    backend is connected to the database) to turn logging on/off,
+    switch  to/from  backup  mode (all other activity is stopped)
+    etc.
+
+    So every 32 databases will require another megabyte of shared
+    memory.  The  logging  master  controls  which databases have
+    activity  and  kills  redolog  writers  after  some  time  of
+    inactivity,  and  the shmem is freed then. But it can hurt if
+    someone really has many many databases that are all  used  at
+    the same time.
+
+    What do the others say?
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
+From owner-pgsql-hackers@hub.org Wed Dec 16 15:46:41 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA00521
+	for <maillist@candle.pha.pa.us>; Wed, 16 Dec 1998 15:46:40 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id PAA08772 for <maillist@candle.pha.pa.us>; Wed, 16 Dec 1998 15:10:01 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id PAA01254;
+	Wed, 16 Dec 1998 15:06:56 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 16 Dec 1998 14:58:11 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id OAA00660
+	for pgsql-hackers-outgoing; Wed, 16 Dec 1998 14:58:10 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id OAA00643
+	for <pgsql-hackers@postgreSQL.org>; Wed, 16 Dec 1998 14:58:05 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@postgreSQL.org 
+	id m0zqNDo-000EBTC; Wed, 16 Dec 98 21:07 MET
+Message-Id: <m0zqNDo-000EBTC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: Re: [HACKERS] redolog - for discussion
+To: vadim@krs.ru (Vadim Mikheev)
+Date: Wed, 16 Dec 1998 21:07:00 +0100 (MET)
+Cc: jwieck@debis.com, pgsql-hackers@postgreSQL.org
+Reply-To: jwieck@debis.com (Jan Wieck)
+In-Reply-To: <3677B71D.C67462B3@krs.ru> from "Vadim Mikheev" at Dec 16, 98 08:35:25 pm
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Vadim wrote:
+
+>
+> Jan Wieck wrote:
+> >
+> >     RECOVER DATABASE {ALL | UNTIL 'datetime' | RESET};
+> >
+> ...
+> >
+> >         For  the  others, the backend starts the recovery program
+> >         which  reads  the  redolog  files,  establishes  database
+> >         connections  as  required  and reruns all the commands in
+>                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
+> >         them. If a required logfile isn't  found,  it  tells  the
+>           ^^^^^
+>
+> I foresee problems with using _commands_ logging for
+> recovery/replication -:((
+>
+> Let's consider two concurrent updates in READ COMMITTED mode:
+>
+> update test set x = 2 where y = 1;
+>
+>    and
+>
+> update test set x = 3 where y = 1;
+>
+> The result of both committed transaction will be x = 2
+> if the 1st transaction updated row _after_ 2nd transaction
+> and x = 3 if the 2nd transaction gets row after 1st one.
+> Order of updates is not defined by order in which commands
+> begun and so order in which commands should be rerun
+> will be unknown...
+
+    Yepp,  the order in which commands begun is absolutely not of
+    interest. Locking could already delay the  execution  of  one
+    command  until  another  one  started  later has finished and
+    released the lock.  It's a classic race condition.
+
+    Thus, my plan was to log the queries just before the call  to
+    CommitTransactionCommand()  in  tcop. This has the advantage,
+    that queries which bail out with errors don't  get  into  the
+    log  at  all  and  must not get rerun. And I can set a static
+    flag to false before starting the command, which  is  set  to
+    true  in  the buffer manager when a buffer is written (marked
+    dirty), so filtering out queries that do no updates at all is
+    easy.
+
+    Unfortunately  query  level  logging get's hit by the current
+    implementation of sequence numbers. If  a  query  that  get's
+    aborted  somewhere  in the middle (maybe by a trigger) called
+    nextval() for rows processed  earlier,  the  sequence  number
+    isn't  advanced  at  recovery  time,  because  the  query  is
+    suppressed at all.   And  sequences  aren't  locked,  so  for
+    concurrently  running  queries  getting numbers from the same
+    sequence,  the  results   aren't   reproduceable.   If   some
+    application  selects  a  value  resulting from a sequence and
+    uses that later in another query, how could the redolog  know
+    that  this has changed? It's a Const in the query logged, and
+    all that corrupts the whole thing.
+
+    All that is painful and I don't see another solution yet than
+    to  hook  into  nextval(),  log  out the numbers generated in
+    normal operation and getting back the same  numbers  in  redo
+    mode.
+
+    The whole thing gets more and more complicated :-(
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
--- a/doc/TODO.detail/memory
+++ b/doc/TODO.detail/memory
+From owner-pgsql-hackers@hub.org Thu Nov 26 08:31:13 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA24423
+	for <maillist@candle.pha.pa.us>; Thu, 26 Nov 1998 08:31:08 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id IAA04554 for <maillist@candle.pha.pa.us>; Thu, 26 Nov 1998 08:04:30 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id HAA03761;
+	Thu, 26 Nov 1998 07:56:37 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 26 Nov 1998 07:55:28 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id HAA03689
+	for pgsql-hackers-outgoing; Thu, 26 Nov 1998 07:55:26 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id HAA03674
+	for <pgsql-hackers@postgreSQL.org>; Thu, 26 Nov 1998 07:55:19 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@postgreSQL.org 
+	id m0zj13G-000EBfC; Thu, 26 Nov 98 14:01 MET
+Message-Id: <m0zj13G-000EBfC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: Re: [HACKERS] Re: memory leak with Abort Transaction
+To: takehi-s@ascii.co.jp (SHIOZAKI Takehiko)
+Date: Thu, 26 Nov 1998 14:01:42 +0100 (MET)
+Cc: pgsql-hackers@postgreSQL.org
+Reply-To: jwieck@debis.com (Jan Wieck)
+In-Reply-To: <199811261240.VAA27516@libpc01.pb.ascii.co.jp> from "SHIOZAKI Takehiko" at Nov 26, 98 09:40:19 pm
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+SHIOZAKI Takehiko wrote:
+
+>
+> Hello!
+>
+> Releasing 6.4.1 is a good news.
+> But would you confirm the following "memory leak" problem?
+> It is reproducable on 6.4 (FreeBSD 2.2.7-RELEASE).
+
+    It's  an far too old problem. And as far as I remember, there
+    are different locations in the code causing it.
+
+    One place I remember well.  It's  in  the  tcop  mainloop  in
+    PostgresMain().  The querytree list is malloc()'ed (there and
+    in the parser) and free()'d after the query  is  processed  -
+    except  the  processing of the queries bails out with elog().
+    In that case it  never  runs  over  the  free()  because  the
+    longjmp() kick's it back to the beginning of the loop.
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
+From owner-pgsql-hackers@hub.org Fri Mar 19 16:01:29 1999
+Received: from hub.org (majordom@hub.org [209.47.145.100])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA05828
+	for <maillist@candle.pha.pa.us>; Fri, 19 Mar 1999 16:01:22 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id PAA15701;
+	Fri, 19 Mar 1999 15:59:51 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 19 Mar 1999 15:59:08 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id PAA15551
+	for pgsql-hackers-outgoing; Fri, 19 Mar 1999 15:59:05 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from andrew.cmu.edu (ANDREW.CMU.EDU [128.2.10.101])
+	by hub.org (8.9.2/8.9.1) with ESMTP id PAA15524
+	for <pgsql-hackers@postgresql.org>; Fri, 19 Mar 1999 15:58:53 -0500 (EST)
+	(envelope-from er1p+@andrew.cmu.edu)
+Received: (from postman@localhost) by andrew.cmu.edu (8.8.5/8.8.2) id PAA29323 for pgsql-hackers@postgresql.org; Fri, 19 Mar 1999 15:58:50 -0500 (EST)
+Received: via switchmail; Fri, 19 Mar 1999 15:58:50 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q005/QF.cqwfdxK00gNtE0TVBp>;
+          Fri, 19 Mar 1999 15:58:37 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.MqwfdrO00gNtEmTPh2>;
+          Fri, 19 Mar 1999 15:58:31 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
+          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
+          Fri, 19 Mar 1999 15:58:29 -0500 (EST)
+Message-ID: <wqwfdpu00gNtImTPUm@andrew.cmu.edu>
+Date: Fri, 19 Mar 1999 15:58:29 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] aggregation memory leak and fix
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+
+Platform:  Alpha, Digital UNIX 4.0D
+Software:  PostgreSQL 6.4.2 and 6.5 snaphot (11 March 1999)
+
+I have a table as follows:
+
+Table    = lineitem
+------------------------+----------------------------------+-------+
+|              Field     |              Type                | Length|
+------------------------+----------------------------------+-------+
+| l_orderkey             | int4 not null                    |     4 |
+| l_partkey              | int4 not null                    |     4 |
+| l_suppkey              | int4 not null                    |     4 |
+| l_linenumber           | int4 not null                    |     4 |
+| l_quantity             | float4 not null                  |     4 |
+| l_extendedprice        | float4 not null                  |     4 |
+| l_discount             | float4 not null                  |     4 |
+| l_tax                  | float4 not null                  |     4 |
+| l_returnflag           | char() not null                  |     1 |
+| l_linestatus           | char() not null                  |     1 |
+| l_shipdate             | date                             |     4 |
+| l_commitdate           | date                             |     4 |
+| l_receiptdate          | date                             |     4 |
+| l_shipinstruct         | char() not null                  |    25 |
+| l_shipmode             | char() not null                  |    10 |
+| l_comment              | char() not null                  |    44 |
+------------------------+----------------------------------+-------+
+Index:    lineitem_index_
+
+that ends up having on the order of 500,000 rows (about 100 MB on disk).  
+
+I then run an aggregation query as:
+
+--
+-- Query 1
+--
+select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, 
+sum(l_extendedprice) as sum_base_price, 
+sum(l_extendedprice*(1-l_discount)) as sum_disc_price, 
+sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, 
+avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, 
+avg(l_discount) as avg_disc, count(*) as count_order 
+from lineitem 
+where l_shipdate <= ('1998-12-01'::datetime - interval '90 day')::date 
+group by l_returnflag, l_linestatus 
+order by l_returnflag, l_linestatus;
+
+
+when I run this against 6.4.2, the postgres process grows to upwards of
+1 GB of memory (at which point something overflows and it dumps core) -
+I watch it grow through 200 MB, 400 MB, 800 MB, dies somewhere near 1 GB
+of allocated memory).
+
+If I take out a few of the "sum" expressions it gets better, removing
+sum_disk_price and sum_charge causes it to be only 600 MB and the query
+actually (eventually) completes.  Takes about 10 minutes on my 500 MHz
+machine with 256 MB core and 4 GB of swap.
+
+The problem seems to be the memory allocation mechanism.  Looking at a
+call trace, it is doing some kind of "sub query" plan for each row in
+the database.  That means it does ExecEval and postquel_function and
+postquel_execute and all their friends for each row in the database. 
+Allocating a couple hundred bytes for each one.
+
+The problem is that none of these allocations are freed - they seem to
+depend on the AllocSet to free them at the end of the transaction.  This
+means it isn't a "true" leak, because the bytes are all freed at the
+(very) end of the transaction, but it does mean that the process grows
+to unreasonable size in the meantime.  There is no need for this,
+because the individual expression results are aggregated as it goes
+along, so the intermediate nodes can be freed.
+
+I spent half a day last week chasing down the offending palloc() calls
+and execution stacks sufficiently that I think I found the right places
+to put pfree() calls.
+
+As a result, I have changes in the files:
+
+src/backend/executor/execUtils.c
+src/backend/executor/nodeResult.c 
+src/backend/executor/nodeAgg.c 
+src/backend/executor/execMain.c 
+
+patches to these files are attached at the end of this message.  These
+files are based on the 6.5.0 snapshot downloaded from ftp.postgreql.org
+on 11 March 1999.
+
+Apologies for sending patches to a non-released version.  If anyone has
+problems applying the patches, I can send the full files (I wanted to
+avoid sending a 100K shell archive to the list).  If anyone cares about
+reproducing my exact problem with the above table, I can provide the 100
+MB pg_dump file for download as well.
+
+Secondary Issue:  the reason I did not use the 6.4.2 code to make my
+changes is because the AllocSet calls in that one were particularly
+egregious - they only had the skeleton of the allocsets code that exists
+in the 6.5 snapshots, so they were calling malloc() for all of the 8 and
+16 byte allocations that the above query causes.
+
+Using the fixed code reduces the maximum memory requirement on the above
+query to about 210 MB, and reduces the runtime to (an acceptable) 1.5
+minutes - a factor of more than 6x improvement on my 256 MB machine.
+
+Now the biggest part of the execution time is in the sort before the
+aggregation (which isn't strictly needed, but that is an optimization
+for another day).
+
+Open Issue: there is still a small "leak" that I couldn't eliminate, I
+think I chased it down to the constvalue allocated in
+execQual::ExecTargetList(), but I couldn't figure out where to properly
+free it.  8 bytes leaked was much better than 750 bytes, so I stopped
+banging my head on that particular item.
+
+Secondary Open Issue: what I did have to do to get down to 210 MB of
+core was reduce the minimum allocation size in AllocSet to 8 bytes from
+16 bytes.  That reduces the 8 byte leak above to a true 8 byte, rather
+than a 16 byte leak.  Otherwise, I think the size was 280 MB (still a
+big improvement on 1000+ MB).  I only changed this in my code and I am
+not including a changed mcxt.c for that.
+
+I hope my changes are understandable/reasonable.  Enjoy.
+
+Erik Riedel
+Carnegie Mellon University
+www.cs.cmu.edu/~riedel
+
+--------------[aggregation_memory_patch.sh]-----------------------
+
+#! /bin/sh
+# This is a shell archive, meaning:
+# 1. Remove everything above the #! /bin/sh line.
+# 2. Save the resulting text in a file.
+# 3. Execute the file with /bin/sh (not csh) to create:
+#	execMain.c.diff
+#	execUtils.c.diff
+#	nodeAgg.c.diff
+#	nodeResult.c.diff
+# This archive created: Fri Mar 19 15:47:17 1999
+export PATH; PATH=/bin:/usr/bin:$PATH
+if test -f 'execMain.c.diff'
+then
+	echo shar: "will not over-write existing file 'execMain.c.diff'"
+else
+cat << \SHAR_EOF > 'execMain.c.diff'
+583c
+	
+.
+398a
+
+.
+396a
+	/* XXX - clean up some more from ExecutorStart() - er1p */
+	if (NULL == estate->es_snapshot) {
+	  /* nothing to free */
+	} else {
+	  if (estate->es_snapshot->xcnt > 0) { 
+	    pfree(estate->es_snapshot->xip);
+	  }
+	  pfree(estate->es_snapshot);
+	}
+
+	if (NULL == estate->es_param_exec_vals) {
+	  /* nothing to free */
+	} else {
+	  pfree(estate->es_param_exec_vals);
+	  estate->es_param_exec_vals = NULL;
+	}
+
+.
+SHAR_EOF
+fi
+if test -f 'execUtils.c.diff'
+then
+	echo shar: "will not over-write existing file 'execUtils.c.diff'"
+else
+cat << \SHAR_EOF > 'execUtils.c.diff'
+368a
+}
+
+/* ----------------
+ *		ExecFreeExprContext
+ * ----------------
+ */
+void
+ExecFreeExprContext(CommonState *commonstate)
+{
+	ExprContext *econtext;
+
+	/* ----------------
+	 *	get expression context.  if NULL then this node has
+	 *	none so we just return.
+	 * ----------------
+	 */
+	econtext = commonstate->cs_ExprContext;
+	if (econtext == NULL)
+		return;
+
+	/* ----------------
+	 *	clean up memory used.
+	 * ----------------
+	 */
+	pfree(econtext);
+	commonstate->cs_ExprContext = NULL;
+}
+
+/* ----------------
+ *		ExecFreeTypeInfo
+ * ----------------
+ */
+void
+ExecFreeTypeInfo(CommonState *commonstate)
+{
+	TupleDesc tupDesc;
+
+	tupDesc = commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor;
+	if (tupDesc == NULL)
+		return;
+
+	/* ----------------
+	 *	clean up memory used.
+	 * ----------------
+	 */
+	FreeTupleDesc(tupDesc);
+	commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor = NULL;
+.
+274a
+
+.
+SHAR_EOF
+fi
+if test -f 'nodeAgg.c.diff'
+then
+	echo shar: "will not over-write existing file 'nodeAgg.c.diff'"
+else
+cat << \SHAR_EOF > 'nodeAgg.c.diff'
+376a
+						pfree(oldVal); /* XXX - new, let's free the old datum - er1p */
+.
+374a
+						oldVal = value1[aggno]; /* XXX - save so we can free later - er1p */
+.
+112a
+	Datum  oldVal = (Datum) NULL;  /* XXX - so that we can save and free on
+each iteration - er1p */
+.
+SHAR_EOF
+fi
+if test -f 'nodeResult.c.diff'
+then
+	echo shar: "will not over-write existing file 'nodeResult.c.diff'"
+else
+cat << \SHAR_EOF > 'nodeResult.c.diff'
+278a
+	pfree(resstate); node->resstate = NULL; /* XXX - new for us - er1p */
+.
+265a
+	ExecFreeExprContext(&resstate->cstate); /* XXX - new for us - er1p */
+	ExecFreeTypeInfo(&resstate->cstate); /* XXX - new for us - er1p */
+.
+SHAR_EOF
+fi
+exit 0
+#	End of shell archive
+
+
+
+From er1p+@andrew.cmu.edu Fri Mar 19 19:43:27 1999
+Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA09183
+	for <maillist@candle.pha.pa.us>; Fri, 19 Mar 1999 19:43:26 -0500 (EST)
+Received: (from postman@localhost) by po8.andrew.cmu.edu (8.8.5/8.8.2) id TAA11773; Fri, 19 Mar 1999 19:43:18 -0500 (EST)
+Received: via switchmail; Fri, 19 Mar 1999 19:43:18 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q007/QF.oqwiwLK00gNtQmTLgB>;
+          Fri, 19 Mar 1999 19:43:05 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
+          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
+          Fri, 19 Mar 1999 19:43:02 -0500 (EST)
+Message-ID: <YqwiwKW00gNt4mTKsv@andrew.cmu.edu>
+Date: Fri, 19 Mar 1999 19:43:02 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+Subject: Re: [HACKERS] aggregation memory leak and fix
+Cc: pgsql-hackers@postgreSQL.org
+In-Reply-To: <199903192223.RAA06691@candle.pha.pa.us>
+References: <199903192223.RAA06691@candle.pha.pa.us>
+Status: ROr
+
+
+> No apologies necessary.  Glad to have someone digging into that area of
+> the code.  We will gladly apply your patches to 6.5.  However, I request
+> that you send context diffs(diff -c).  Normal diffs are just too
+> error-prone in application.   Send them, and I will apply them right
+> away.
+>  
+Context diffs attached.  This was due to my ignorance of diff.  When I
+made the other files, I though "hmm, these could be difficult to apply
+if the code has changed a bit, wouldn't it be good if they included a
+few lines before and after the fix".  Now I know "-c".
+
+> Not sure why that is there?  Perhaps for GROUP BY processing?
+>  
+Right, it is a result of the Group processing requiring sorted input. 
+Just that it doesn't "require" sorted input, it "could" be a little more
+flexible and the sort wouldn't be necessary.  Essentially this would be
+a single "AggSort" node that did the aggregation while sorting (probably
+with replacement selection rather than quicksort).  This definitely
+would require some code/smarts that isn't there today.
+
+> > think I chased it down to the constvalue allocated in
+> > execQual::ExecTargetList(), but I couldn't figure out where to properly
+> > free it.  8 bytes leaked was much better than 750 bytes, so I stopped
+> > banging my head on that particular item.
+>  
+> Can you give me the exact line?  Is it the palloc(1)?
+>  
+No, the 8 bytes seem to come from the ExecEvalExpr() call near line
+1530.  Problem was when I tried to free these, I got "not in AllocSet"
+errors, so something more complicated was going on.
+
+Thanks.
+
+Erik
+
+-----------[aggregation_memory_patch.sh]----------------------
+
+#! /bin/sh
+# This is a shell archive, meaning:
+# 1. Remove everything above the #! /bin/sh line.
+# 2. Save the resulting text in a file.
+# 3. Execute the file with /bin/sh (not csh) to create:
+#	execMain.c.diff
+#	execUtils.c.diff
+#	nodeAgg.c.diff
+#	nodeResult.c.diff
+# This archive created: Fri Mar 19 19:35:42 1999
+export PATH; PATH=/bin:/usr/bin:$PATH
+if test -f 'execMain.c.diff'
+then
+	echo shar: "will not over-write existing file 'execMain.c.diff'"
+else
+cat << \SHAR_EOF > 'execMain.c.diff'
+***
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
+execMain.c	Thu Mar 11 23:59:11 1999
+---
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
+execMain.c	Fri Mar 19 15:03:28 1999
+***************
+*** 394,401 ****
+--- 394,419 ----
+  
+  	EndPlan(queryDesc->plantree, estate);
+  
+ 	/* XXX - clean up some more from ExecutorStart() - er1p */
+ 	if (NULL == estate->es_snapshot) {
+ 	  /* nothing to free */
+ 	} else {
+ 	  if (estate->es_snapshot->xcnt > 0) { 
+ 	    pfree(estate->es_snapshot->xip);
+ 	  }
+ 	  pfree(estate->es_snapshot);
+ 	}
+ 
+ 	if (NULL == estate->es_param_exec_vals) {
+ 	  /* nothing to free */
+ 	} else {
+ 	  pfree(estate->es_param_exec_vals);
+ 	  estate->es_param_exec_vals = NULL;
+ 	}
+ 
+  	/* restore saved refcounts. */
+  	BufferRefCountRestore(estate->es_refcount);
+ 
+  }
+  
+  void
+***************
+*** 580,586 ****
+  	/*
+  	 *	initialize result relation stuff
+  	 */
+! 
+  	if (resultRelation != 0 && operation != CMD_SELECT)
+  	{
+  		/*
+--- 598,604 ----
+  	/*
+  	 *	initialize result relation stuff
+  	 */
+! 	
+  	if (resultRelation != 0 && operation != CMD_SELECT)
+  	{
+  		/*
+SHAR_EOF
+fi
+if test -f 'execUtils.c.diff'
+then
+	echo shar: "will not over-write existing file 'execUtils.c.diff'"
+else
+cat << \SHAR_EOF > 'execUtils.c.diff'
+***
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
+execUtils.c	Thu Mar 11 23:59:11 1999
+---
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
+execUtils.c	Fri Mar 19 14:55:59 1999
+***************
+*** 272,277 ****
+--- 272,278 ----
+  #endif
+  		i++;
+  	}
+ 
+  	if (len > 0)
+  	{
+  		ExecAssignResultType(commonstate,
+***************
+*** 366,371 ****
+--- 367,419 ----
+  
+  	pfree(projInfo);
+  	commonstate->cs_ProjInfo = NULL;
+ }
+ 
+ /* ----------------
+  *		ExecFreeExprContext
+  * ----------------
+  */
+ void
+ ExecFreeExprContext(CommonState *commonstate)
+ {
+ 	ExprContext *econtext;
+ 
+ 	/* ----------------
+ 	 *	get expression context.  if NULL then this node has
+ 	 *	none so we just return.
+ 	 * ----------------
+ 	 */
+ 	econtext = commonstate->cs_ExprContext;
+ 	if (econtext == NULL)
+ 		return;
+ 
+ 	/* ----------------
+ 	 *	clean up memory used.
+ 	 * ----------------
+ 	 */
+ 	pfree(econtext);
+ 	commonstate->cs_ExprContext = NULL;
+ }
+ 
+ /* ----------------
+  *		ExecFreeTypeInfo
+  * ----------------
+  */
+ void
+ ExecFreeTypeInfo(CommonState *commonstate)
+ {
+ 	TupleDesc tupDesc;
+ 
+ 	tupDesc = commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor;
+ 	if (tupDesc == NULL)
+ 		return;
+ 
+ 	/* ----------------
+ 	 *	clean up memory used.
+ 	 * ----------------
+ 	 */
+ 	FreeTupleDesc(tupDesc);
+ 	commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor = NULL;
+  }
+  
+  /* ----------------------------------------------------------------
+SHAR_EOF
+fi
+if test -f 'nodeAgg.c.diff'
+then
+	echo shar: "will not over-write existing file 'nodeAgg.c.diff'"
+else
+cat << \SHAR_EOF > 'nodeAgg.c.diff'
+***
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
+nodeAgg.c	Thu Mar 11 23:59:11 1999
+---
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
+nodeAgg.c	Fri Mar 19 15:01:21 1999
+***************
+*** 110,115 ****
+--- 110,116 ----
+  				isNull2 = FALSE;
+  	bool		qual_result;
+  
+ 	Datum  oldVal = (Datum) NULL;  /* XXX - so that we can save and free
+on each iteration - er1p */
+  
+  	/* ---------------------
+  	 *	get state info from node
+***************
+*** 372,379 ****
+--- 373,382 ----
+  						 */
+  						args[0] = value1[aggno];
+  						args[1] = newVal;
+ 						oldVal = value1[aggno]; /* XXX - save so we can free later - er1p */
+  						value1[aggno] =	(Datum) fmgr_c(&aggfns->xfn1,
+  										   (FmgrValues *) args, &isNull1);
+ 						pfree(oldVal); /* XXX - new, let's free the old datum - er1p */
+  						Assert(!isNull1);
+  					}
+  				}
+SHAR_EOF
+fi
+if test -f 'nodeResult.c.diff'
+then
+	echo shar: "will not over-write existing file 'nodeResult.c.diff'"
+else
+cat << \SHAR_EOF > 'nodeResult.c.diff'
+***
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
+nodeResult.c	Thu Mar 11 23:59:12 1999
+---
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
+nodeResult.c	Fri Mar 19 14:57:26 1999
+***************
+*** 263,268 ****
+--- 263,270 ----
+  	 *		  is freed at end-transaction time.  -cim 6/2/91
+  	 * ----------------
+  	 */
+ 	ExecFreeExprContext(&resstate->cstate); /* XXX - new for us - er1p */
+ 	ExecFreeTypeInfo(&resstate->cstate); /* XXX - new for us - er1p */
+  	ExecFreeProjectionInfo(&resstate->cstate);
+  
+  	/* ----------------
+***************
+*** 276,281 ****
+--- 278,284 ----
+  	 * ----------------
+  	 */
+  	ExecClearTuple(resstate->cstate.cs_ResultTupleSlot);
+ 	pfree(resstate); node->resstate = NULL; /* XXX - new for us - er1p */
+  }
+  
+  void
+SHAR_EOF
+fi
+exit 0
+#	End of shell archive
+
+
+From owner-pgsql-hackers@hub.org Fri Mar 19 21:01:15 1999
+Received: from hub.org (majordom@hub.org [209.47.145.100])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA11368
+	for <maillist@candle.pha.pa.us>; Fri, 19 Mar 1999 21:01:13 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id UAA40887;
+	Fri, 19 Mar 1999 20:59:47 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 19 Mar 1999 20:58:14 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id UAA40637
+	for pgsql-hackers-outgoing; Fri, 19 Mar 1999 20:58:12 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from candle.pha.pa.us (maillist@s5-03.ppp.op.net [209.152.195.67])
+	by hub.org (8.9.2/8.9.1) with ESMTP id UAA40620
+	for <pgsql-hackers@postgreSQL.org>; Fri, 19 Mar 1999 20:58:02 -0500 (EST)
+	(envelope-from maillist@candle.pha.pa.us)
+Received: (from maillist@localhost)
+	by candle.pha.pa.us (8.9.0/8.9.0) id UAA11263;
+	Fri, 19 Mar 1999 20:58:00 -0500 (EST)
+From: Bruce Momjian <maillist@candle.pha.pa.us>
+Message-Id: <199903200158.UAA11263@candle.pha.pa.us>
+Subject: Re: [HACKERS] aggregation memory leak and fix
+In-Reply-To: <YqwiwKW00gNt4mTKsv@andrew.cmu.edu> from Erik Riedel at "Mar 19, 1999  7:43: 2 pm"
+To: riedel+@CMU.EDU (Erik Riedel)
+Date: Fri, 19 Mar 1999 20:58:00 -0500 (EST)
+Cc: pgsql-hackers@postgreSQL.org
+X-Mailer: ELM [version 2.4ME+ PL47 (25)]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+> 
+> > No apologies necessary.  Glad to have someone digging into that area of
+> > the code.  We will gladly apply your patches to 6.5.  However, I request
+> > that you send context diffs(diff -c).  Normal diffs are just too
+> > error-prone in application.   Send them, and I will apply them right
+> > away.
+> >  
+> Context diffs attached.  This was due to my ignorance of diff.  When I
+> made the other files, I though "hmm, these could be difficult to apply
+> if the code has changed a bit, wouldn't it be good if they included a
+> few lines before and after the fix".  Now I know "-c".
+
+Applied.
+
+> > Not sure why that is there?  Perhaps for GROUP BY processing?
+> >  
+> Right, it is a result of the Group processing requiring sorted input. 
+> Just that it doesn't "require" sorted input, it "could" be a little more
+> flexible and the sort wouldn't be necessary.  Essentially this would be
+> a single "AggSort" node that did the aggregation while sorting (probably
+> with replacement selection rather than quicksort).  This definitely
+> would require some code/smarts that isn't there today.
+
+I think you will find make_groupPlan adds the sort as needed by the
+GROUP BY.  I assume you are suggesting to do the aggregate/GROUP on unsorted
+data, which is hard to do in a flexible way.
+
+> > > think I chased it down to the constvalue allocated in
+> > > execQual::ExecTargetList(), but I couldn't figure out where to properly
+> > > free it.  8 bytes leaked was much better than 750 bytes, so I stopped
+> > > banging my head on that particular item.
+> >  
+> > Can you give me the exact line?  Is it the palloc(1)?
+> >  
+> No, the 8 bytes seem to come from the ExecEvalExpr() call near line
+> 1530.  Problem was when I tried to free these, I got "not in AllocSet"
+> errors, so something more complicated was going on.
+
+Yes, if you look inside ExecEvalExpr(), you will see it tries to get a
+value for the expression(Datum).  It may return an int, float4, or a
+string.  In the last case, that is actually a pointer and not a specific
+value.
+
+So, in some cases, the value can just be thrown away, or it may be a
+pointer to memory that can be freed after the call to heap_formtuple()
+later in the function.  The trick is to find the function call in
+ExecEvalExpr() that is allocating something, and conditionally free
+values[] after the call to heap_formtuple().  If you don't want find it,
+perhaps you can send me enough info so I can see it here.
+
+I wonder whether it is the call to CreateTupleDescCopy() inside
+ExecEvalVar()?
+
+Another problem I just fixed is that fjIsNull was not being pfree'ed if
+it was used with >64 targets, but I don't think that affects you.
+
+I also assume you have run your recent patch through the the
+test/regression tests, so see it does not cause some other area to fail,
+right?
+
+-- 
+  Bruce Momjian                        |  http://www.op.net/~candle
+  maillist@candle.pha.pa.us            |  (610) 853-3000
+  +  If your life is a hard drive,     |  830 Blythe Avenue
+  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
+
+
+From owner-pgsql-hackers@hub.org Sat Mar 20 12:01:44 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA24855
+	for <maillist@candle.pha.pa.us>; Sat, 20 Mar 1999 12:01:43 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id LAA11985 for <maillist@candle.pha.pa.us>; Sat, 20 Mar 1999 11:58:48 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id LAA12367;
+	Sat, 20 Mar 1999 11:57:17 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 20 Mar 1999 11:55:22 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id LAA12026
+	for pgsql-hackers-outgoing; Sat, 20 Mar 1999 11:55:17 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id LAA11871
+	for <pgsql-hackers@postgreSQL.org>; Sat, 20 Mar 1999 11:54:57 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id LAA28068;
+	Sat, 20 Mar 1999 11:48:58 -0500 (EST)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: riedel+@CMU.EDU (Erik Riedel), pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] aggregation memory leak and fix 
+In-reply-to: Your message of Fri, 19 Mar 1999 21:33:33 -0500 (EST) 
+             <199903200233.VAA11816@candle.pha.pa.us> 
+Date: Sat, 20 Mar 1999 11:48:58 -0500
+Message-ID: <28066.921948538@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> My only quick solution would seem to be to add a new "expression" memory
+> context, that can be cleared after every tuple is processed, clearing
+> out temporary values allocated inside an expression.
+
+Right, this whole problem of growing backend memory use during a large
+SELECT (or COPY, or probably a few other things) is one of the things
+that we were talking about addressing by revising the memory management
+structure.
+
+I think what we want inside the executor is a distinction between
+storage that must live to the end of the statement and storage that is
+only needed while processing the current tuple.  The second kind of
+storage would go into a separate context that gets flushed every so
+often.  (It could be every tuple, or every dozen or hundred tuples
+depending on what seems the best tradeoff of cycles against memory
+usage.)
+
+I'm not sure that just two contexts is enough, either.  For example in
+	SELECT field1, SUM(field2) GROUP BY field1;
+the working memory for the SUM aggregate could not be released after
+each tuple, but perhaps we don't want it to live for the whole statement
+either --- in that case we'd need a per-group context.  (This particular
+example isn't very convincing, because the same storage for the SUM
+*could* be recycled from group to group.  But I don't know whether it
+actually *is* reused or not.  If fresh storage is palloc'd for each
+instantiation of SUM then we have a per-group leak in this scenario.
+In any case, I'm not sure all aggregate functions have constant memory
+requirements that would let them recycle storage across groups.)
+
+What we need to do is work out what the best set of memory context
+definitions is, and then decide on a strategy for making sure that
+lower-level routines allocate their return values in the right context.
+It'd be nice if the lower-level routines could still call palloc() and
+not have to worry about this explicitly --- otherwise we'll break not
+only a lot of our own code but perhaps a lot of user code.  (User-
+specific data types and SPI code all use palloc, no?)
+
+I think it is too late to try to fix this for 6.5, but it ought to be a
+top priority for 6.6.
+
+			regards, tom lane
+
+
+From tgl@sss.pgh.pa.us Sun Mar 21 16:01:46 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA00139
+	for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 16:01:45 -0500 (EST)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id PAA27737 for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 15:52:38 -0500 (EST)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id PAA14946;
+	Sun, 21 Mar 1999 15:50:20 -0500 (EST)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] aggregation memory leak and fix 
+In-reply-to: Your message of Sun, 21 Mar 1999 14:20:39 -0500 (EST) 
+             <199903211920.OAA28744@candle.pha.pa.us> 
+Date: Sun, 21 Mar 1999 15:50:20 -0500
+Message-ID: <14944.922049420@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Status: ROr
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+>> What we need to do is work out what the best set of memory context
+>> definitions is, and then decide on a strategy for making sure that
+>> lower-level routines allocate their return values in the right context.
+
+> Let's suppose that we want to free all the memory used as expression
+> intermediate values after each row is processed.
+> It is my understanding that all these are created in utils/adt/*.c
+> files, and that the entry point to all those functions via
+> fmgr()/fmgr_c().
+
+That's probably the bulk of the specific calls of palloc().  Someone
+(Jan?) did a scan of the code a while ago looking for palloc() calls,
+and there aren't that many outside of the data-type-specific functions.
+But we'd have to look individually at all the ones that are elsewhere.
+
+> So, if we go into an expression memory context before calling
+> fmgr/fmgr_c in the executor, and return to the normal context after the
+> function call, all our intermediates are trapped in the expression
+> memory context.
+
+OK, so you're saying we leave the data-type-specific functions as is
+(calling palloc() to allocate their result areas), and make each call
+site specifically responsible for setting the context that palloc() will
+allocate from?  That could work, I think.  We'd need to see what side
+effects it'd have on other uses of palloc().
+
+What we'd probably want is to use a stack discipline for the current
+palloc-target memory context: when you set the context, you get back the
+ID of the old context, and you are supposed to restore that old context
+before returning.
+
+> At the end of each row, we just free the expression memory context.  In
+> almost all cases, the data is stored in tuples, and we can free it.  In
+> a few cases like aggregates, we have to save off the value we need to
+> keep before freeing the expression context.
+
+Actually, nodeAgg would just have to set an appropriate context before
+calling fmgr to execute the aggregate's transition functions, and then
+it wouldn't need an extra copy step.  The results would come back in the
+right context already.
+
+> In fact, you could even optimize the cleanup to only do free'ing if
+> some expression memory was allocated.  In most cases, it is not.
+
+Jan's stuff should already fall through pretty quickly if there's
+nothing in the context, I think.  Note that what we want to do between
+tuples is a "context clear" of the expression context, not a "context
+delete" and then "context create" a new expression context.  Context
+clear should be a pretty quick no-op if nothing's been allocated in that
+context...
+
+> In fact the nodeAgg.c patch that I backed out attempted to do that,
+> though because there wasn't code that checked if the Datum was
+> pg_type.typbyval, it didn't work 100%.
+
+Right.  But if we approach it this way (clear the context at appropriate
+times) rather than thinking in terms of explicitly pfree'ing individual
+objects, life gets much simpler.  Also, if we insist on being able to
+pfree individual objects inside a context, we can't use Jan's faster
+allocator!  Remember, the reason it is faster and lower overhead is that
+it doesn't keep track of individual objects, only pools.
+
+I'd like to see us head in the direction of removing most of the
+explicit pfree calls that exist now, and instead rely on clearing
+memory contexts at appropriate times in order to manage memory.
+The fewer places where we need pfree, the more contexts can be run
+with the low-overhead space allocator.  Also, the fewer explicit
+pfrees we need, the simpler and more reliable the code gets.
+
+			regards, tom lane
+
+From owner-pgsql-hackers@hub.org Sun Mar 21 16:01:49 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA00149
+	for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 16:01:48 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id PAA27950 for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 15:56:07 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id PAA39413;
+	Sun, 21 Mar 1999 15:54:51 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 21 Mar 1999 15:54:31 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id PAA39249
+	for pgsql-hackers-outgoing; Sun, 21 Mar 1999 15:54:27 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id PAA39235
+	for <pgsql-hackers@postgreSQL.org>; Sun, 21 Mar 1999 15:54:21 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id PAA14946;
+	Sun, 21 Mar 1999 15:50:20 -0500 (EST)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] aggregation memory leak and fix 
+In-reply-to: Your message of Sun, 21 Mar 1999 14:20:39 -0500 (EST) 
+             <199903211920.OAA28744@candle.pha.pa.us> 
+Date: Sun, 21 Mar 1999 15:50:20 -0500
+Message-ID: <14944.922049420@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+>> What we need to do is work out what the best set of memory context
+>> definitions is, and then decide on a strategy for making sure that
+>> lower-level routines allocate their return values in the right context.
+
+> Let's suppose that we want to free all the memory used as expression
+> intermediate values after each row is processed.
+> It is my understanding that all these are created in utils/adt/*.c
+> files, and that the entry point to all those functions via
+> fmgr()/fmgr_c().
+
+That's probably the bulk of the specific calls of palloc().  Someone
+(Jan?) did a scan of the code a while ago looking for palloc() calls,
+and there aren't that many outside of the data-type-specific functions.
+But we'd have to look individually at all the ones that are elsewhere.
+
+> So, if we go into an expression memory context before calling
+> fmgr/fmgr_c in the executor, and return to the normal context after the
+> function call, all our intermediates are trapped in the expression
+> memory context.
+
+OK, so you're saying we leave the data-type-specific functions as is
+(calling palloc() to allocate their result areas), and make each call
+site specifically responsible for setting the context that palloc() will
+allocate from?  That could work, I think.  We'd need to see what side
+effects it'd have on other uses of palloc().
+
+What we'd probably want is to use a stack discipline for the current
+palloc-target memory context: when you set the context, you get back the
+ID of the old context, and you are supposed to restore that old context
+before returning.
+
+> At the end of each row, we just free the expression memory context.  In
+> almost all cases, the data is stored in tuples, and we can free it.  In
+> a few cases like aggregates, we have to save off the value we need to
+> keep before freeing the expression context.
+
+Actually, nodeAgg would just have to set an appropriate context before
+calling fmgr to execute the aggregate's transition functions, and then
+it wouldn't need an extra copy step.  The results would come back in the
+right context already.
+
+> In fact, you could even optimize the cleanup to only do free'ing if
+> some expression memory was allocated.  In most cases, it is not.
+
+Jan's stuff should already fall through pretty quickly if there's
+nothing in the context, I think.  Note that what we want to do between
+tuples is a "context clear" of the expression context, not a "context
+delete" and then "context create" a new expression context.  Context
+clear should be a pretty quick no-op if nothing's been allocated in that
+context...
+
+> In fact the nodeAgg.c patch that I backed out attempted to do that,
+> though because there wasn't code that checked if the Datum was
+> pg_type.typbyval, it didn't work 100%.
+
+Right.  But if we approach it this way (clear the context at appropriate
+times) rather than thinking in terms of explicitly pfree'ing individual
+objects, life gets much simpler.  Also, if we insist on being able to
+pfree individual objects inside a context, we can't use Jan's faster
+allocator!  Remember, the reason it is faster and lower overhead is that
+it doesn't keep track of individual objects, only pools.
+
+I'd like to see us head in the direction of removing most of the
+explicit pfree calls that exist now, and instead rely on clearing
+memory contexts at appropriate times in order to manage memory.
+The fewer places where we need pfree, the more contexts can be run
+with the low-overhead space allocator.  Also, the fewer explicit
+pfrees we need, the simpler and more reliable the code gets.
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Wed Mar 24 19:10:53 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00906
+	for <maillist@candle.pha.pa.us>; Wed, 24 Mar 1999 19:10:52 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id NAA24258 for <maillist@candle.pha.pa.us>; Wed, 24 Mar 1999 13:09:47 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id NAA60743;
+	Wed, 24 Mar 1999 13:07:26 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 24 Mar 1999 13:06:47 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id NAA60556
+	for pgsql-hackers-outgoing; Wed, 24 Mar 1999 13:06:43 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from po7.andrew.cmu.edu (PO7.ANDREW.CMU.EDU [128.2.10.107])
+	by hub.org (8.9.2/8.9.1) with ESMTP id NAA60540
+	for <pgsql-hackers@postgreSQL.org>; Wed, 24 Mar 1999 13:06:25 -0500 (EST)
+	(envelope-from er1p+@andrew.cmu.edu)
+Received: (from postman@localhost) by po7.andrew.cmu.edu (8.8.5/8.8.2) id NAA06323; Wed, 24 Mar 1999 13:06:16 -0500 (EST)
+Received: via switchmail; Wed, 24 Mar 1999 13:06:16 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q001/QF.cqyGa::00gNtI0TZtD>;
+          Wed, 24 Mar 1999 13:06:02 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.sqyGa8G00gNtImTGBe>;
+          Wed, 24 Mar 1999 13:06:00 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
+          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
+          Wed, 24 Mar 1999 13:05:58 -0500 (EST)
+Message-ID: <QqyGa6600gNtMmTG1o@andrew.cmu.edu>
+Date: Wed, 24 Mar 1999 13:05:58 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+Subject: Re: [HACKERS] aggregation memory leak and fix
+Cc: pgsql-hackers@postgreSQL.org
+In-Reply-To: <199903240611.BAA01206@candle.pha.pa.us>
+References: <199903240611.BAA01206@candle.pha.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+
+> I am interested to see if it fixes the expression leak you saw.  I have
+> not committed this yet.  I want to look at it some more.
+>  
+I'm afraid that this doesn't seem to have any effect on my query.
+
+Looking at your code, I think the problem is that most of the
+allocations in my query are on the top part of the if statement that
+you modified (i.e. the == SQLlanguageId part).  Below is a snippet of
+a trace from my query, with approximate line numbers for execQual.c
+with your patch applied:
+
+(execQual) language == SQLlanguageId (execQual.c:757)
+(execQual) execute postquel_function (execQual.c:759)
+(mcxt) MemoryContextAlloc 32 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 16 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 528 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 56 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 88 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 24 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 65 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 48 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
+(execQual) else clause NOT SQLlanguageId (execQual.c:822)
+(execQual) install qual memory context (execQual.c:858)
+(execQual) exit qual context (execQual.c:862)
+(mcxt) MemoryContextAlloc 60 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 528 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
+(execQual) return from postquel_function (execQual.c:764)
+(execQual) return from ExecEvalFuncArgs (execQual.c:792)
+(execQual) else clause NOT SQLlanguageId (execQual.c:822)
+(execQual) install qual memory context (execQual.c:858)
+(execQual) exit qual context (execQual.c:862)
+(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 128 bytes
+(execQual) else clause NOT SQLlanguageId (execQual.c:822)
+(execQual) install qual memory context (execQual.c:858)
+(mcxt) MemoryContextAlloc 8 bytes in <Qual manager>-heap
+(execQual) exit qual context (execQual.c:862)
+
+<pattern repeats>
+
+(execQual) language == SQLlanguageId (execQual.c:757)
+(execQual) execute postquel_function (execQual.c:759)
+(mcxt) MemoryContextAlloc 32 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 16 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 528 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 56 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 88 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 24 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 65 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 48 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
+(execQual) else clause NOT SQLlanguageId (execQual.c:822)
+(execQual) install qual memory context (execQual.c:858)
+(execQual) exit qual context (execQual.c:862)
+(mcxt) MemoryContextAlloc 60 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 528 bytes
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
+(execQual) return from postquel_function (execQual.c:764)
+(execQual) return from ExecEvalFuncArgs (execQual.c:792)
+(execQual) else clause NOT SQLlanguageId (execQual.c:822)
+(execQual) install qual memory context (execQual.c:858)
+(execQual) exit qual context (execQual.c:862)
+(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
+(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 128 bytes
+(execQual) else clause NOT SQLlanguageId (execQual.c:822)
+(execQual) install qual memory context (execQual.c:858)
+(mcxt) MemoryContextAlloc 8 bytes in <Qual manager>-heap
+(execQual) exit qual context (execQual.c:862)
+
+
+the MemoryContext lines give the name of the portal where each
+allocation is happening - you see that your Qual manager only captures
+a very small number (one) of the allocations, the rest are in the
+upper part of the if statement.
+
+Note that I also placed a printf next to your EndPortalAllocMode() and
+StartPortalAllocMode() fix in ExecQual() - I believe this is what is
+supposed to clear the portal and free the memory - and that printf
+never appears in the above trace.
+
+Sorry if the trace is a little confusing, but I hope that it helps you
+zero in.
+
+Erik
+
+
+
+
+
+
+
+From owner-pgsql-hackers@hub.org Sat May 15 23:13:50 1999
+Received: from hub.org (hub.org [209.167.229.1])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA29144
+	for <maillist@candle.pha.pa.us>; Sat, 15 May 1999 23:13:49 -0400 (EDT)
+Received: from hub.org (hub.org [209.167.229.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id XAA25173;
+	Sat, 15 May 1999 23:11:03 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 15 May 1999 23:10:29 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id XAA25111
+	for pgsql-hackers-outgoing; Sat, 15 May 1999 23:10:27 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.3/8.9.3) with ESMTP id XAA25092
+	for <pgsql-hackers@postgreSQL.org>; Sat, 15 May 1999 23:10:22 -0400 (EDT)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id XAA17752
+	for <pgsql-hackers@postgreSQL.org>; Sat, 15 May 1999 23:09:46 -0400 (EDT)
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] Memory leaks in relcache
+Date: Sat, 15 May 1999 23:09:46 -0400
+Message-ID: <17750.926824186@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+I have been looking into why a reference to a nonexistent table, eg
+	INSERT INTO nosuchtable VALUES(1);
+leaks a small amount of memory per occurrence.  What I find is a
+memory leak in the indexscan support.  Specifically,
+RelationGetIndexScan in backend/access/index/genam.c palloc's both
+an IndexScanDesc and some keydata storage.  The IndexScanDesc
+block is eventually pfree'd, at the bottom of CatalogIndexFetchTuple
+in backend/catalog/indexing.c.  But the keydata block is not.
+
+This wouldn't matter so much if the palloc were coming from a
+transaction-local context.  But what we're doing is a lookup in pg_class
+on behalf of RelationBuildDesc in backend/utils/cache/relcache.c, and
+it's done a MemoryContextSwitchTo into the global CacheCxt before
+starting the lookup.  Therefore, the un-pfreed block represents a
+permanent memory leak.
+
+In fact, *every* reference to a relation that is not already present in
+the relcache causes a similar leak.  The error case is just the one that
+is easiest to repeat.  The missing pfree of the keydata block is
+probably causing a bunch of other short-term and long-term leaks too.
+
+It seems to me there are two things to fix here: indexscan ought to
+pfree everything it pallocs, and RelationBuildDesc ought to be warier
+about how much work gets done with CacheCxt as the active palloc
+context.  (Even if indexscan didn't leak anything ordinarily, there's
+still the risk of elog(ERROR) causing an abort before the indexscan code
+gets to clean up.)
+
+Comments?  In particular, where is the cleanest place to add the pfree
+of the keydata block?  I don't especially like the fact that callers
+of index_endscan have to clean up the toplevel scan block; I think that
+ought to happen inside index_endscan.
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/nulls
+++ b/doc/TODO.detail/nulls
+From owner-pgsql-general@hub.org Fri Oct  9 18:22:09 1998
+Received: from hub.org (majordom@hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA04220
+	for <maillist@candle.pha.pa.us>; Fri, 9 Oct 1998 18:22:08 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id SAA26960;
+	Fri, 9 Oct 1998 18:18:29 -0400 (EDT)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 09 Oct 1998 18:18:07 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id SAA26917
+	for pgsql-general-outgoing; Fri, 9 Oct 1998 18:18:04 -0400 (EDT)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-general@postgreSQL.org using -f
+Received: from gecko.statsol.com (gecko.statsol.com [198.11.51.133])
+	by hub.org (8.8.8/8.8.8) with ESMTP id SAA26904
+	for <pgsql-general@postgresql.org>; Fri, 9 Oct 1998 18:17:46 -0400 (EDT)
+	(envelope-from statsol@statsol.com)
+Received: from gecko (gecko [198.11.51.133])
+	by gecko.statsol.com (8.9.0/8.9.0) with SMTP id SAA00557
+	for <pgsql-general@postgresql.org>; Fri, 9 Oct 1998 18:18:00 -0400 (EDT)
+Date: Fri, 9 Oct 1998 18:18:00 -0400 (EDT)
+From: Steve Doliov <statsol@statsol.com>
+X-Sender: statsol@gecko
+To: pgsql-general@postgreSQL.org
+Subject: Re: [GENERAL] Making NULLs visible.
+Message-ID: <Pine.GSO.3.96.981009181716.545B-100000@gecko>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+On Fri, 9 Oct 1998, Bruce Momjian wrote:
+
+> [Charset iso-8859-1 unsupported, filtering to ASCII...]
+> > > Yes, \ always outputs as \\, excepts someone changed it last week, and I
+> > > am requesting a reversal.  Do you like the \N if it is unique?
+> > 
+> > Well, it's certainly clear, but could be confused with \n (newline). Can we
+> > have \0 instead?
+> 
+> Yes, but it is uppercase.  \0 looks like an octal number to me, and I
+> think we even output octals sometimes, don't we?
+> 
+
+my first suggestion may have been hare-brained, but why not just make the
+specifics of the output user-configurable.  So if the user chooses \0, so
+be it, if the user chooses \N so be it, if the user likes NULL so be it.
+but the option would only have one value per database at any given point
+in time.  so database x could use \N on tuesday and NULL on wednesday, but
+database x could never have two references to the characters(s) used to
+represent a null value.
+
+steve
+
+
+
+
+From owner-pgsql-general@hub.org Sun Oct 11 17:31:08 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA20043
+	for <maillist@candle.pha.pa.us>; Sun, 11 Oct 1998 17:31:02 -0400 (EDT)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id RAA03069 for <maillist@candle.pha.pa.us>; Sun, 11 Oct 1998 17:10:34 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id QAA10856;
+	Sun, 11 Oct 1998 16:57:34 -0400 (EDT)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 11 Oct 1998 16:53:35 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id QAA10393
+	for pgsql-general-outgoing; Sun, 11 Oct 1998 16:53:34 -0400 (EDT)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-general@postgreSQL.org using -f
+Received: from mail1.panix.com (mail1.panix.com [166.84.0.212])
+	by hub.org (8.8.8/8.8.8) with ESMTP id QAA10378
+	for <pgsql-general@postgreSQL.org>; Sun, 11 Oct 1998 16:53:28 -0400 (EDT)
+	(envelope-from tomg@admin.nrnet.org)
+Received: from mailhost.nrnet.org (root@mailhost.nrnet.org [166.84.192.39])
+	by mail1.panix.com (8.8.8/8.8.8/PanixM1.3) with ESMTP id QAA16311
+	for <pgsql-general@postgreSQL.org>; Sun, 11 Oct 1998 16:53:24 -0400 (EDT)
+Received: from admin.nrnet.org (uucp@localhost)
+          by mailhost.nrnet.org (8.8.7/8.8.4) with UUCP
+   id QAA16345 for pgsql-general@postgreSQL.org; Sun, 11 Oct 1998 16:28:47 -0400
+Received: from localhost (tomg@localhost)
+	by admin.nrnet.org (8.8.7/8.8.7) with SMTP id QAA11569
+	for <pgsql-general@postgreSQL.org>; Sun, 11 Oct 1998 16:28:41 -0400
+Date: Sun, 11 Oct 1998 16:28:41 -0400 (EDT)
+From: Thomas Good <tomg@admin.nrnet.org>
+To: pgsql-general@postgreSQL.org
+Subject: Re: [GENERAL] Making NULLs visible.
+In-Reply-To: <Pine.GSO.3.96.981009181716.545B-100000@gecko>
+Message-ID: <Pine.LNX.3.96.981011161908.11556A-100000@admin.nrnet.org>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Watching all this go by...as a guy who has to move alot of data
+from legacy dbs to postgres, I've gotten used to \N being a null.
+
+My vote, if I were allowed to cast one, would be to have one null 
+and that would be the COPY command null.  I have no difficulty
+distinguishing a null from a newline...
+
+At the pgsql command prompt I would find seeing \N rather reassuring.
+I've seen alot of these little guys.
+
+    ---------- Sisters of Charity Medical Center ----------
+                   Department of Psychiatry
+                            ----     
+    Thomas Good                          <tomg@q8.nrnet.org>
+    Coordinator, North Richmond C.M.H.C. Information Systems
+    75 Vanderbilt Ave, Quarters 8        Phone: 718-354-5528
+    Staten Island, NY   10304            Fax:   718-354-5056
+
+
+
--- a/doc/TODO.detail/optimizer
+++ b/doc/TODO.detail/optimizer
+From owner-pgsql-hackers@hub.org Mon Mar 22 18:43:41 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA23978
+	for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 18:43:39 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id SAA06472 for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 18:36:44 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id SAA92604;
+	Mon, 22 Mar 1999 18:34:23 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Mar 1999 18:33:50 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id SAA92469
+	for pgsql-hackers-outgoing; Mon, 22 Mar 1999 18:33:47 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108])
+	by hub.org (8.9.2/8.9.1) with ESMTP id SAA92456
+	for <pgsql-hackers@postgresql.org>; Mon, 22 Mar 1999 18:33:41 -0500 (EST)
+	(envelope-from er1p+@andrew.cmu.edu)
+Received: (from postman@localhost) by po8.andrew.cmu.edu (8.8.5/8.8.2) id SAA12894 for pgsql-hackers@postgresql.org; Mon, 22 Mar 1999 18:33:38 -0500 (EST)
+Received: via switchmail; Mon, 22 Mar 1999 18:33:38 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q007/QF.Aqxh7Lu00gNtQ0TZE5>;
+          Mon, 22 Mar 1999 18:27:20 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.Uqxh7JS00gNtMmTJFk>;
+          Mon, 22 Mar 1999 18:27:17 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
+          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
+          Mon, 22 Mar 1999 18:27:15 -0500 (EST)
+Message-ID: <sqxh7H_00gNtAmTJ5Q@andrew.cmu.edu>
+Date: Mon, 22 Mar 1999 18:27:15 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] optimizer and type question
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+
+[last week aggregation, this week, the optimizer]
+
+I have a somewhat general optimizer question/problem that I would like
+to get some input on - i.e. I'd like to know what is "supposed" to
+work here and what I should be expecting.  Sadly, I think the patch
+for this is more involved than my last message.
+
+Using my favorite table these days:
+
+Table    = lineitem
+------------------------+----------------------------------+-------+
+|              Field     |              Type                | Length|
+------------------------+----------------------------------+-------+
+| l_orderkey             | int4 not null                    |     4 |
+| l_partkey              | int4 not null                    |     4 |
+| l_suppkey              | int4 not null                    |     4 |
+| l_linenumber           | int4 not null                    |     4 |
+| l_quantity             | float4 not null                  |     4 |
+| l_extendedprice        | float4 not null                  |     4 |
+| l_discount             | float4 not null                  |     4 |
+| l_tax                  | float4 not null                  |     4 |
+| l_returnflag           | char() not null                  |     1 |
+| l_linestatus           | char() not null                  |     1 |
+| l_shipdate             | date                             |     4 |
+| l_commitdate           | date                             |     4 |
+| l_receiptdate          | date                             |     4 |
+| l_shipinstruct         | char() not null                  |    25 |
+| l_shipmode             | char() not null                  |    10 |
+| l_comment              | char() not null                  |    44 |
+------------------------+----------------------------------+-------+
+Index:    lineitem_index_
+
+and the query:
+
+--
+-- Query 1
+--
+explain select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, 
+sum(l_extendedprice) as sum_base_price, 
+sum(l_extendedprice*(1-l_discount)) as sum_disc_price, 
+sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, 
+avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, 
+avg(l_discount) as avg_disc, count(*) as count_order 
+from lineitem 
+where l_shipdate <= '1998-09-02'::date 
+group by l_returnflag, l_linestatus 
+order by l_returnflag, l_linestatus;
+
+
+note that I have eliminated the date calculation in my query of last
+week and manually replaced it with a constant (since this wasn't
+happening automatically - but let's not worry about that for now).
+And this is only an explain, we care about the optimizer.  So we get:
+
+Sort  (cost=34467.88 size=0 width=0)
+ ->  Aggregate  (cost=34467.88 size=0 width=0)
+   ->  Group  (cost=34467.88 size=0 width=0)
+     ->  Sort  (cost=34467.88 size=0 width=0)
+       ->  Seq Scan on lineitem  (cost=34467.88 size=200191 width=44)
+
+so let's think about the selectivity that is being chosen for the
+seq scan (the where l_shipdate <= '1998-09-02').
+
+Turns out the optimizer is choosing "33%", even though the real answer
+is somewhere in 90+% (that's how the query is designed).  So, why does
+it do that?
+
+Turns out that selectivity in this case is determined via
+plancat::restriction_selectivity() which calls into functionOID = 103
+(intltsel) for operatorOID = 1096 (date "<=") on relation OID = 18663
+(my lineitem).
+
+This all follows because of the description of 1096 (date "<=") in
+pg_operator.  Looking at local1_template1.bki.source near line 1754
+shows:
+
+insert OID = 1096 ( "<=" PGUID 0 <...> date_le intltsel intltjoinsel )
+
+where we see that indeed, it thinks "intltsel" is the right function
+to use for "oprrest" in the case of dates.
+
+Question 1 - is intltsel the right thing for selectivity on dates?
+
+Hope someone is still with me.
+
+So now we're running selfuncs::intltsel() where we make a further call
+to selfuncs::gethilokey().  The job of gethilokey is to determine the
+min and max values of a particular attribute in the table, which will
+then be used with the constant in my where clause to estimate the
+selectivity.  It is going to search the pg_statistic relation with
+three key values:
+
+Anum_pg_statistic_starelid     18663  (lineitem)
+Anum_pg_statistic_staattnum       11  (l_shipdate)
+Anum_pg_statistic_staop         1096  (date "<=")
+
+this finds no tuples in pg_statistic.  Why is that?  The only nearby
+tuple in pg_statistic is:
+
+starelid|staattnum|staop|stalokey        |stahikey       
+--------+---------+-----+----------------+----------------
+   18663|       11|    0|01-02-1992      |12-01-1998
+
+and the reason the query doesn't match anything?  Because 1096 != 0.
+But why is it 0 in pg_statistic?  Statistics are determined near line
+1844 in vacuum.c (assuming a 'vacuum analyze' run at some point)
+
+             i = 0;
+             values[i++] = (Datum) relid;            /* 1 */
+             values[i++] = (Datum) attp->attnum; /* 2 */
+====>        values[i++] = (Datum) InvalidOid;       /* 3 */
+             fmgr_info(stats->outfunc, &out_function);
+             out_string = <...min...>
+             values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+             pfree(out_string);
+             out_string = <...max...>
+             values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+             pfree(out_string);
+             stup = heap_formtuple(sd->rd_att, values, nulls);
+
+the "offending" line is setting the staop to InvalidOid (i.e. 0).
+
+Question 2 - is this right?  Is the intent for 0 to serve as a
+"wildcard", or should it be inserting an entry for each operation
+individually?
+
+In the case of "wildcard" then gethilokey() should allow a match for 
+
+Anum_pg_statistic_staop         0
+
+instead of requiring the more restrictive 1096.  In the current code,
+what happens next is gethilokey() returns "not found" and intltsel()
+returns the default 1/3 which I see in the resultant query plan (size
+= 200191 is 1/3 of the number of lineitem tuples).
+
+Question 3 - is there any inherent reason it couldn't get this right?
+The statistic is in the table 1992 to 1998, so the '1998-09-02' date
+should be 90-some% selectivity, a much better guess than 33%.
+
+Doesn't make a difference for this particular query, of course,
+because the seq scan must proceed anyhow, but it could easily affect
+other queries where selectivities matter (and it affects the
+modifications I am trying to test in the optimizer to be "smarter"
+about selectivities - my overall context is to understand/improve the
+behavior that the underlying storage system sees from queries like this).
+
+OK, so let's say we treat 0 as a "wildcard" and stop checking for
+1096.  Not we let gethilokey() return the two dates from the statistic
+table.  The immediate next thing that intltsel() does, near lines 122
+in selfuncs.c is call atol() on the strings from gethilokey().  And
+guess what it comes up with?
+
+low = 1
+high = 12
+
+because it calls atol() on '01-02-1992' and '12-01-1998'.  This
+clearly isn't right, it should get some large integer that includes
+the year and day in the result.  Then it should compare reasonably
+with my constant from the where clause and give a decent selectivity
+value.  This leads to a re-visit of Question 1.
+
+Question 4 - should date "<=" use a dateltsel() function instead of
+intltsel() as oprrest?
+
+If anyone is still with me, could you tell me if this makes sense, or
+if there is some other location where the appropriate type conversion
+could take place so that intltsel() gets something reasonable when it
+does the atol() calls?
+
+Could someone also give me a sense for how far out-of-whack the whole
+current selectivity-handling structure is?  It seems that most of the
+operators in pg_operator actually use intltsel() and would have
+type-specific problems like that described.  Or is the problem in the
+way attribute values are stored in pg_statistic by vacuum analyze?  Or
+is there another layer where type conversion belongs?
+
+Phew.  Enough typing, hope someone can follow this and address at
+least some of the questions.
+
+Thanks.
+
+Erik Riedel
+Carnegie Mellon University
+www.cs.cmu.edu/~riedel
+
+
+
+From owner-pgsql-hackers@hub.org Mon Mar 22 20:31:11 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00802
+	for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 20:31:09 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id UAA13231 for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 20:15:20 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id UAA01981;
+	Mon, 22 Mar 1999 20:14:04 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Mar 1999 20:13:32 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id UAA01835
+	for pgsql-hackers-outgoing; Mon, 22 Mar 1999 20:13:28 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id UAA01822
+	for <pgsql-hackers@postgreSQL.org>; Mon, 22 Mar 1999 20:13:21 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id UAA23294;
+	Mon, 22 Mar 1999 20:12:43 -0500 (EST)
+To: Erik Riedel <riedel+@CMU.EDU>
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 18:27:15 -0500 (EST) 
+             <sqxh7H_00gNtAmTJ5Q@andrew.cmu.edu> 
+Date: Mon, 22 Mar 1999 20:12:43 -0500
+Message-ID: <23292.922151563@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+Erik Riedel <riedel+@CMU.EDU> writes:
+> [ optimizer doesn't find relevant pg_statistic entry ]
+
+It's clearly a bug that the selectivity code is not finding this tuple.
+If your analysis is correct, then selectivity estimation has *never*
+worked properly, or at least not in recent memory :-(.  Yipes.
+Bruce and I found a bunch of other problems in the optimizer recently,
+so it doesn't faze me to assume that this is broken too.
+
+> the "offending" line is setting the staop to InvalidOid (i.e. 0).
+> Question 2 - is this right?  Is the intent for 0 to serve as a
+> "wildcard",
+
+My thought is that what the staop column ought to be is the OID of the
+comparison function that was used to determine the sort order of the
+column.  Without a sort op the lowest and highest keys in the column are
+not well defined, so it makes no sense to assert "these are the lowest
+and highest values" without providing the sort op that determined that.
+(For sufficiently complex data types one could reasonably have multiple
+ordering operators.  A crude example is sorting on "circumference" and
+"area" for polygons.)  But typically the sort op will be the "<"
+operator for the column data type.
+
+So, the vacuum code is definitely broken --- it's not storing the sort
+op that it used.  The code in gethilokey might be broken too, depending
+on how it is producing the operator it's trying to match against the
+tuple.  For example, if the actual operator in the query is any of
+< <= > >= on int4, then int4lt ought to be used to probe the pg_statistic
+table.  I'm not sure if we have adequate info in pg_operator or pg_type
+to let the optimizer code determine the right thing to probe with :-(
+
+> The immediate next thing that intltsel() does, near lines 122
+> in selfuncs.c is call atol() on the strings from gethilokey().  And
+> guess what it comes up with?
+> low = 1
+> high = 12
+> because it calls atol() on '01-02-1992' and '12-01-1998'.  This
+> clearly isn't right, it should get some large integer that includes
+> the year and day in the result.  Then it should compare reasonably
+> with my constant from the where clause and give a decent selectivity
+> value.  This leads to a re-visit of Question 1.
+> Question 4 - should date "<=" use a dateltsel() function instead of
+> intltsel() as oprrest?
+
+This is clearly busted as well.  I'm not sure that creating dateltsel()
+is the right fix, however, because if you go down that path then every
+single datatype needs its own selectivity function; that's more than we
+need.
+
+What we really want here is to be able to map datatype values into
+some sort of numeric range so that we can compute what fraction of the
+low-key-to-high-key range is on each side of the probe value (the
+constant taken from the query).  This general concept will apply to
+many scalar types, so what we want is a type-specific mapping function
+and a less-specific fraction-computing-function.  Offhand I'd say that
+we want intltsel() and floatltsel(), plus conversion routines that can
+produce either int4 or float8 from a data type as seems appropriate.
+Anything that couldn't map to one or the other would have to supply its
+own selectivity function.
+
+> Or is the problem in the
+> way attribute values are stored in pg_statistic by vacuum analyze?
+
+Looks like it converts the low and high values to text and stores them
+that way.  Ugly as can be :-( but I'm not sure there is a good
+alternative.  We have no "wild card" column type AFAIK, which is what
+these columns of pg_statistic would have to be to allow storage of
+unconverted min and max values.
+
+I think you've found a can of worms here.  Congratulations ;-)
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon Mar 22 23:31:00 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA03384
+	for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 23:30:58 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id XAA25586 for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 23:18:25 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id XAA17955;
+	Mon, 22 Mar 1999 23:17:24 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Mar 1999 23:16:49 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id XAA17764
+	for pgsql-hackers-outgoing; Mon, 22 Mar 1999 23:16:46 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108])
+	by hub.org (8.9.2/8.9.1) with ESMTP id XAA17745
+	for <pgsql-hackers@postgreSQL.org>; Mon, 22 Mar 1999 23:16:39 -0500 (EST)
+	(envelope-from er1p+@andrew.cmu.edu)
+Received: (from postman@localhost) by po8.andrew.cmu.edu (8.8.5/8.8.2) id XAA04273; Mon, 22 Mar 1999 23:16:37 -0500 (EST)
+Received: via switchmail; Mon, 22 Mar 1999 23:16:37 -0500 (EST)
+Received: from hazy.adsl.net.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q000/QF.kqxlJ:S00anI00p040>;
+          Mon, 22 Mar 1999 23:15:09 -0500 (EST)
+Received: from hazy.adsl.net.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.MqxlJ3q00anI01hKE0>;
+          Mon, 22 Mar 1999 23:15:00 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.02.53.sun4.51.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.hazy.adsl.net.cmu.edu.sun4m.54
+          via MS.5.6.hazy.adsl.net.cmu.edu.sun4_51;
+          Mon, 22 Mar 1999 23:14:55 -0500 (EST)
+Message-ID: <4qxlJ0200anI01hK40@andrew.cmu.edu>
+Date: Mon, 22 Mar 1999 23:14:55 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: Tom Lane <tgl@sss.pgh.pa.us>
+Subject: Re: [HACKERS] optimizer and type question
+Cc: pgsql-hackers@postgreSQL.org
+In-Reply-To: <23292.922151563@sss.pgh.pa.us>
+References: <23292.922151563@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+
+OK, building on your high-level explanation, I am attaching a patch that
+attempts to do something "better" than the current code.  Note that I
+have only tested this with the date type and my particular query.  I
+haven't run it through the regression, so consider it "proof of concept"
+at best.  Although hopefully it will serve my purposes.
+
+> My thought is that what the staop column ought to be is the OID of the
+> comparison function that was used to determine the sort order of the
+> column.  Without a sort op the lowest and highest keys in the column are
+> not well defined, so it makes no sense to assert "these are the lowest
+> and highest values" without providing the sort op that determined that.
+>
+> (For sufficiently complex data types one could reasonably have multiple
+> ordering operators.  A crude example is sorting on "circumference" and
+> "area" for polygons.)  But typically the sort op will be the "<"
+> operator for the column data type.
+>  
+I changed vacuum.c to do exactly that.  oid of the lt sort op.
+
+> So, the vacuum code is definitely broken --- it's not storing the sort
+> op that it used.  The code in gethilokey might be broken too, depending
+> on how it is producing the operator it's trying to match against the
+> tuple.  For example, if the actual operator in the query is any of
+> < <= > >= on int4, then int4lt ought to be used to probe the pg_statistic
+> table.  I'm not sure if we have adequate info in pg_operator or pg_type
+> to let the optimizer code determine the right thing to probe with :-(
+>  
+This indeed seems like a bigger problem.  I thought about somehow using
+type-matching from the sort op and the actual operator in the query - if
+both the left and right type match, then consider them the same for
+purposes of this probe.  That seemed complicated, so I punted in my
+example - it just does the search with relid and attnum and assumes that
+only returns one tuple.  This works in my case (maybe in all cases,
+because of the way vacuum is currently written - ?).
+
+> What we really want here is to be able to map datatype values into
+> some sort of numeric range so that we can compute what fraction of the
+> low-key-to-high-key range is on each side of the probe value (the
+> constant taken from the query).  This general concept will apply to
+> many scalar types, so what we want is a type-specific mapping function
+> and a less-specific fraction-computing-function.  Offhand I'd say that
+> we want intltsel() and floatltsel(), plus conversion routines that can
+> produce either int4 or float8 from a data type as seems appropriate.
+> Anything that couldn't map to one or the other would have to supply its
+> own selectivity function.
+>  
+This is what my example then does.  Uses the stored sort op to get the
+type and then uses typinput to convert from the string to an int4.
+
+Then puts the int4 back into string format because that's what everyone
+was expecting.
+
+It seems to work for my particular query.  I now get:
+
+(selfuncs) gethilokey() obj 18663 attr 11 opid 1096 (ignored)
+(selfuncs) gethilokey() found op 1087 in pg_proc
+(selfuncs) gethilokey() found type 1082 in pg_type
+(selfuncs) gethilokey() going to use 1084 to convert type 1082
+(selfuncs) gethilokey() have low -2921 high -396
+(selfuncs) intltsel() high -396 low -2921 val -486
+(plancat) restriction_selectivity() for func 103 op 1096 rel 18663 attr
+11 const -486 flag 3 returns 0.964356
+NOTICE:  QUERY PLAN:
+
+Sort  (cost=34467.88 size=0 width=0)
+ ->  Aggregate  (cost=34467.88 size=0 width=0)
+  ->  Group  (cost=34467.88 size=0 width=0)
+   ->  Sort  (cost=34467.88 size=0 width=0)
+    ->  Seq Scan on lineitem  (cost=34467.88 size=579166 width=44)
+
+including my printfs, which exist in the patch as well.
+
+Selectivity is now the expected 96% and the size estimate for the seq
+scan is much closer to correct.
+
+Again, not tested with anything besides date, so caveat not-tested.
+
+Hope this helps.
+
+Erik
+
+----------------------[optimizer_fix.sh]------------------------
+
+#! /bin/sh
+# This is a shell archive, meaning:
+# 1. Remove everything above the #! /bin/sh line.
+# 2. Save the resulting text in a file.
+# 3. Execute the file with /bin/sh (not csh) to create:
+#	selfuncs.c.diff
+#	vacuum.c.diff
+# This archive created: Mon Mar 22 22:58:14 1999
+export PATH; PATH=/bin:/usr/bin:$PATH
+if test -f 'selfuncs.c.diff'
+then
+	echo shar: "will not over-write existing file 'selfuncs.c.diff'"
+else
+cat << \SHAR_EOF > 'selfuncs.c.diff'
+***
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/utils/adt
+/selfuncs.c	Thu Mar 11 23:59:35 1999
+---
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/615/src/backend/utils/adt
+/selfuncs.c	Mon Mar 22 22:57:25 1999
+***************
+*** 32,37 ****
+--- 32,40 ----
+  #include "utils/lsyscache.h"	/* for get_oprrest() */
+  #include "catalog/pg_statistic.h"
+  
+ #include "catalog/pg_proc.h"    /* for Form_pg_proc */
+ #include "catalog/pg_type.h"    /* for Form_pg_type */
+ 
+  /* N is not a valid var/constant or relation id */
+  #define NONVALUE(N)		((N) == -1)
+  
+***************
+*** 103,110 ****
+  				bottom;
+  
+  	result = (float64) palloc(sizeof(float64data));
+! 	if (NONVALUE(attno) || NONVALUE(relid))
+  		*result = 1.0 / 3;
+  	else
+  	{
+  		/* XXX			val = atol(value); */
+--- 106,114 ----
+  				bottom;
+  
+  	result = (float64) palloc(sizeof(float64data));
+! 	if (NONVALUE(attno) || NONVALUE(relid)) {
+  		*result = 1.0 / 3;
+ 	}
+  	else
+  	{
+  		/* XXX			val = atol(value); */
+***************
+*** 117,130 ****
+  		}
+  		high = atol(highchar);
+  		low = atol(lowchar);
+  		if ((flag & SEL_RIGHT && val < low) ||
+  			(!(flag & SEL_RIGHT) && val > high))
+  		{
+  			float32data nvals;
+  
+  			nvals = getattdisbursion(relid, (int) attno);
+! 			if (nvals == 0)
+  				*result = 1.0 / 3.0;
+  			else
+  			{
+  				*result = 3.0 * (float64data) nvals;
+--- 121,136 ----
+  		}
+  		high = atol(highchar);
+  		low = atol(lowchar);
+ 		printf("(selfuncs) intltsel() high %d low %d val %d\n",high,low,val);
+  		if ((flag & SEL_RIGHT && val < low) ||
+  			(!(flag & SEL_RIGHT) && val > high))
+  		{
+  			float32data nvals;
+  
+  			nvals = getattdisbursion(relid, (int) attno);
+! 			if (nvals == 0) {
+  				*result = 1.0 / 3.0;
+ 			}
+  			else
+  			{
+  				*result = 3.0 * (float64data) nvals;
+***************
+*** 336,341 ****
+--- 342,353 ----
+  {
+  	Relation	rel;
+  	HeapScanDesc scan;
+ 	/* this assumes there is only one row in the statistics table for any
+particular */
+ 	/* relid, attnum pair - could be more complicated if staop is also
+used.         */
+ 	/* at the moment, if there are multiple rows, this code ends up
+picking the      */
+ 	/* "first" one                                                       
+   - er1p  */
+ 	/* the actual "ignoring" is done in the call to heap_beginscan()
+below, where    */
+ 	/* we only mention 2 of the 3 keys in this array                     
+   - er1p  */
+  	static ScanKeyData key[3] = {
+  		{0, Anum_pg_statistic_starelid, F_OIDEQ, {0, 0, F_OIDEQ}},
+  		{0, Anum_pg_statistic_staattnum, F_INT2EQ, {0, 0, F_INT2EQ}},
+***************
+*** 344,355 ****
+  	bool		isnull;
+  	HeapTuple	tuple;
+  
+  	rel = heap_openr(StatisticRelationName);
+  
+  	key[0].sk_argument = ObjectIdGetDatum(relid);
+  	key[1].sk_argument = Int16GetDatum((int16) attnum);
+  	key[2].sk_argument = ObjectIdGetDatum(opid);
+! 	scan = heap_beginscan(rel, 0, SnapshotNow, 3, key);
+  	tuple = heap_getnext(scan, 0);
+  	if (!HeapTupleIsValid(tuple))
+  	{
+--- 356,377 ----
+  	bool		isnull;
+  	HeapTuple	tuple;
+  
+ 	HeapTuple tup;
+ 	Form_pg_proc proc;
+ 	Form_pg_type typ;
+ 	Oid which_op;
+ 	Oid which_type;
+ 	int32 low_value;
+ 	int32 high_value;
+ 
+  	rel = heap_openr(StatisticRelationName);
+  
+  	key[0].sk_argument = ObjectIdGetDatum(relid);
+  	key[1].sk_argument = Int16GetDatum((int16) attnum);
+  	key[2].sk_argument = ObjectIdGetDatum(opid);
+! 	printf("(selfuncs) gethilokey() obj %d attr %d opid %d (ignored)\n",
+! 	       key[0].sk_argument,key[1].sk_argument,key[2].sk_argument);
+! 	scan = heap_beginscan(rel, 0, SnapshotNow, 2, key);
+  	tuple = heap_getnext(scan, 0);
+  	if (!HeapTupleIsValid(tuple))
+  	{
+***************
+*** 376,383 ****
+--- 398,461 ----
+  								&isnull));
+  	if (isnull)
+  		elog(DEBUG, "gethilokey: low key is null");
+ 
+  	heap_endscan(scan);
+  	heap_close(rel);
+ 
+ 	/* now we deal with type conversion issues                           
+         */
+ 	/* when intltsel() calls this routine (who knows what other callers
+might do)  */
+ 	/* it assumes that it can call atol() on the strings and then use
+integer      */
+ 	/* comparison from there.  what we are going to do here, then, is try
+to use   */
+ 	/* the type information from Anum_pg_statistic_staop to convert the
+high       */
+ 	/* and low values                                                   
+- er1p    */
+ 
+ 	/* WARNING: this code has only been tested with the date type and has
+NOT      */
+ 	/* been regression tested.  consider it "sample" code of what might
+be the     */
+ 	/* right kind of thing to do                                        
+- er1p    */
+ 
+ 	/* get the 'op' from pg_statistic and look it up in pg_proc */
+ 	which_op = heap_getattr(tuple,
+ 				Anum_pg_statistic_staop,
+ 				RelationGetDescr(rel),
+ 				&isnull);
+ 	if (InvalidOid == which_op) {
+ 	  /* ignore all this stuff, try conversion only if we have a valid staop */
+ 	  /* note that there is an accompanying change to 'vacuum analyze' that  */
+ 	  /* gets this set to something useful.                                  */
+ 	} else {
+ 	  /* staop looks valid, so let's see what we can do about conversion */
+ 	  tup = SearchSysCacheTuple(PROOID, ObjectIdGetDatum(which_op), 0, 0, 0);
+ 	  if (!HeapTupleIsValid(tup)) {
+ 	    elog(ERROR, "selfuncs: unable to find op in pg_proc %d", which_op);
+ 	  }
+ 	  printf("(selfuncs) gethilokey() found op %d in pg_proc\n",which_op);
+ 	  
+ 	  /* use that to determine the type of stahikey and stalokey via pg_type */
+ 	  proc = (Form_pg_proc) GETSTRUCT(tup);
+ 	  which_type = proc->proargtypes[0]; /* XXX - use left and right
+separately? */
+ 	  tup = SearchSysCacheTuple(TYPOID, ObjectIdGetDatum(which_type), 0, 0, 0);
+ 	  if (!HeapTupleIsValid(tup)) {
+ 	    elog(ERROR, "selfuncs: unable to find type in pg_type %d", which_type);
+ 	  }
+ 	  printf("(selfuncs) gethilokey() found type %d in pg_type\n",which_type);
+ 	  
+ 	  /* and use that type to get the conversion function to int4 */
+ 	  typ = (Form_pg_type) GETSTRUCT(tup);
+ 	  printf("(selfuncs) gethilokey() going to use %d to convert type
+%d\n",typ->typinput,which_type);
+ 	  
+ 	  /* and convert the low and high strings */
+ 	  low_value = (int32) fmgr(typ->typinput, *low, -1);
+ 	  high_value = (int32) fmgr(typ->typinput, *high, -1);
+ 	  printf("(selfuncs) gethilokey() have low %d high
+%d\n",low_value,high_value);
+ 	  
+ 	  /* now we have int4's, which we put back into strings because
+that's what out  */
+ 	  /* callers (intltsel() at least) expect                            
+    - er1p */
+ 	  pfree(*low); pfree(*high); /* let's not leak the old strings */
+ 	  *low = int4out(low_value);
+ 	  *high = int4out(high_value);
+ 
+ 	  /* XXX - this probably leaks the two tups we got from
+SearchSysCacheTuple() - er1p */
+ 	}
+  }
+  
+  float64
+SHAR_EOF
+fi
+if test -f 'vacuum.c.diff'
+then
+	echo shar: "will not over-write existing file 'vacuum.c.diff'"
+else
+cat << \SHAR_EOF > 'vacuum.c.diff'
+***
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/commands/
+vacuum.c	Thu Mar 11 23:59:09 1999
+---
+/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/615/src/backend/commands/
+vacuum.c	Mon Mar 22 21:23:15 1999
+***************
+*** 1842,1848 ****
+  					i = 0;
+  					values[i++] = (Datum) relid;		/* 1 */
+  					values[i++] = (Datum) attp->attnum; /* 2 */
+! 					values[i++] = (Datum) InvalidOid;	/* 3 */
+  					fmgr_info(stats->outfunc, &out_function);
+  					out_string = (*fmgr_faddr(&out_function)) (stats->min,
+stats->attr->atttypid);
+  					values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+--- 1842,1848 ----
+  					i = 0;
+  					values[i++] = (Datum) relid;		/* 1 */
+  					values[i++] = (Datum) attp->attnum; /* 2 */
+! 					values[i++] = (Datum) stats->f_cmplt.fn_oid;	/* 3 */ /* get the
+'<' oid, instead of 'invalid' - er1p */
+  					fmgr_info(stats->outfunc, &out_function);
+  					out_string = (*fmgr_faddr(&out_function)) (stats->min,
+stats->attr->atttypid);
+  					values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+SHAR_EOF
+fi
+exit 0
+#	End of shell archive
+
+
+
+From owner-pgsql-hackers@hub.org Tue Mar 23 12:31:05 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA17491
+	for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:31:04 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA08839 for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:08:14 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id MAA93649;
+	Tue, 23 Mar 1999 12:04:57 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 23 Mar 1999 12:03:00 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id MAA93355
+	for pgsql-hackers-outgoing; Tue, 23 Mar 1999 12:02:55 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id MAA93336
+	for <pgsql-hackers@postgreSQL.org>; Tue, 23 Mar 1999 12:02:43 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA24455;
+	Tue, 23 Mar 1999 12:01:57 -0500 (EST)
+To: Erik Riedel <riedel+@CMU.EDU>
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 23:14:55 -0500 (EST) 
+             <4qxlJ0200anI01hK40@andrew.cmu.edu> 
+Date: Tue, 23 Mar 1999 12:01:57 -0500
+Message-ID: <24453.922208517@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Erik Riedel <riedel+@CMU.EDU> writes:
+> OK, building on your high-level explanation, I am attaching a patch that
+> attempts to do something "better" than the current code.  Note that I
+> have only tested this with the date type and my particular query.
+
+Glad to see you working on this.  I don't like the details of your
+patch too much though ;-).  Here are some suggestions for making it
+better.
+
+1. I think just removing staop from the lookup in gethilokey is OK for
+now, though I'm dubious about Bruce's thought that we could delete that
+field entirely.  As you observe, vacuum will not currently put more
+than one tuple for a column into pg_statistic, so we can just do the
+lookup with relid and attno and leave it at that.  But I think we ought
+to leave the field there, with the idea that vacuum might someday
+compute more than one statistic for a data column.  Fixing vacuum to
+put its sort op into the field is a good idea in the meantime.
+
+2. The type conversion you're doing in gethilokey is a mess; I think
+what you ought to make it do is simply the inbound conversion of the
+string from pg_statistic into the internal representation for the
+column's datatype, and return that value as a Datum.  It also needs
+a cleaner success/failure return convention --- this business with
+"n" return is ridiculously type-specific.  Also, the best and easiest
+way to find the type to convert to is to look up the column type in
+the info for the given relid, not search pg_proc with the staop value.
+(I'm not sure that will even work, since there are pg_proc entries
+with wildcard argument types.)
+
+3. The atol() calls currently found in intltsel are a type-specific
+cheat on what is conceptually a two-step process:
+  * Convert the string stored in pg_statistic back to the internal
+    form for the column data type.
+  * Generate a numeric representation of the data value that can be
+    used as an estimate of the range of values in the table.
+The second step is trivial for integers, which may obscure the fact
+that there are two steps involved, but nonetheless there are.  If
+you think about applying selectivity logic to strings, say, it
+becomes clear that the second step is a necessary component of the
+process.  Furthermore, the second step must also be applied to the
+probe value that's being passed into the selectivity operator.
+(The probe value is already in internal form, of course; but it is
+not necessarily in a useful numeric form.)
+
+We can do the first of these steps by applying the appropriate "XXXin"
+conversion function for the column data type, as you have done.  The
+interesting question is how to do the second one.  A really clean
+solution would require adding a column to pg_type that points to a
+function that will do the appropriate conversion.  I'd be inclined to
+make all of these functions return "double" (float8) and just have one
+top-level selectivity routine for all data types that can use
+range-based selectivity logic.
+
+We could probably hack something together that would not use an explicit
+conversion function for each data type, but instead would rely on
+type-specific assumptions inside the selectivity routines.  We'd need many
+more selectivity routines though (at least one for each of int, float4,
+float8, and text data types) so I'm not sure we'd really save any work
+compared to doing it right.
+
+BTW, now that I look at this issue it's real clear that the selectivity
+entries in pg_operator are horribly broken.  The intltsel/intgtsel
+selectivity routines are currently applied to 32 distinct data types:
+
+regression=> select distinct typname,oprleft from pg_operator, pg_type
+regression-> where pg_type.oid = oprleft
+regression-> and oprrest in (103,104);
+typname  |oprleft
+---------+-------
+_aclitem |   1034
+abstime  |    702
+bool     |     16
+box      |    603
+bpchar   |   1042
+char     |     18
+cidr     |    650
+circle   |    718
+date     |   1082
+datetime |   1184
+float4   |    700
+float8   |    701
+inet     |    869
+int2     |     21
+int4     |     23
+int8     |     20
+line     |    628
+lseg     |    601
+macaddr  |    829
+money    |    790
+name     |     19
+numeric  |   1700
+oid      |     26
+oid8     |     30
+path     |    602
+point    |    600
+polygon  |    604
+text     |     25
+time     |   1083
+timespan |   1186
+timestamp|   1296
+varchar  |   1043
+(32 rows)
+
+many of which are very obviously not compatible with integer for *any*
+purpose.  It looks to me like a lot of data types were added to
+pg_operator just by copy-and-paste, without paying attention to whether
+the selectivity routines were actually correct for the data type.
+
+As the code stands today, the bogus entries don't matter because
+gethilokey always fails, so we always get 1/3 as the selectivity
+estimate for any comparison operator (except = and != of course).
+I had actually noticed that fact and assumed that it was supposed
+to work that way :-(.  But, clearly, there is code in here that
+is *trying* to be smarter.
+
+As soon as we fix gethilokey so that it can succeed, we will start
+getting essentially-random selectivity estimates for those data types
+that aren't actually binary-compatible with integer.  That will not do;
+we have to do something about the issue.
+
+			regards, tom lane
+
+
+From tgl@sss.pgh.pa.us Tue Mar 23 12:31:02 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA17484
+	for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:31:01 -0500 (EST)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA09042 for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:10:55 -0500 (EST)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA24474;
+	Tue, 23 Mar 1999 12:09:52 -0500 (EST)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: riedel+@CMU.EDU, pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 21:25:45 -0500 (EST) 
+             <199903230225.VAA01641@candle.pha.pa.us> 
+Date: Tue, 23 Mar 1999 12:09:52 -0500
+Message-ID: <24471.922208992@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Status: RO
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> What we really need is some way to determine how far the requested value
+> is from the min/max values.  With int, we just do (val-min)/(max-min). 
+> That works, but how do we do that for types that don't support division.
+> Strings come to mind in this case.
+
+What I'm envisioning is that we still apply the (val-min)/(max-min)
+logic, but apply it to numeric values that are produced in a
+type-dependent way.
+
+For ints and floats the conversion is trivial, of course.
+
+For strings, the first thing that comes to mind is to return 0 for a
+null string and the value of the first byte for a non-null string.
+This would give you one-part-in-256 selectivity which is plenty good
+enough for what the selectivity code needs to do.  (Actually, it's
+only that good if the strings' first bytes are pretty well spread out.
+If you have a table containing English words, for example, you might
+only get about one part in 26 this way, since the first bytes will
+probably only run from A to Z.  Might be better to use the first two
+characters of the string to compute the selectivity representation.)
+
+In general, you can apply this logic as long as you can come up with
+some numerical approximation to the data type's sorting order.  It
+doesn't have to be exact.
+
+			regards, tom lane
+
+From owner-pgsql-hackers@hub.org Tue Mar 23 12:31:03 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA17488
+	for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:31:02 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA09987 for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:21:34 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id MAA95155;
+	Tue, 23 Mar 1999 12:18:33 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 23 Mar 1999 12:17:00 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id MAA94857
+	for pgsql-hackers-outgoing; Tue, 23 Mar 1999 12:16:56 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id MAA94469
+	for <pgsql-hackers@postgreSQL.org>; Tue, 23 Mar 1999 12:11:33 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA24474;
+	Tue, 23 Mar 1999 12:09:52 -0500 (EST)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: riedel+@CMU.EDU, pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 21:25:45 -0500 (EST) 
+             <199903230225.VAA01641@candle.pha.pa.us> 
+Date: Tue, 23 Mar 1999 12:09:52 -0500
+Message-ID: <24471.922208992@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> What we really need is some way to determine how far the requested value
+> is from the min/max values.  With int, we just do (val-min)/(max-min). 
+> That works, but how do we do that for types that don't support division.
+> Strings come to mind in this case.
+
+What I'm envisioning is that we still apply the (val-min)/(max-min)
+logic, but apply it to numeric values that are produced in a
+type-dependent way.
+
+For ints and floats the conversion is trivial, of course.
+
+For strings, the first thing that comes to mind is to return 0 for a
+null string and the value of the first byte for a non-null string.
+This would give you one-part-in-256 selectivity which is plenty good
+enough for what the selectivity code needs to do.  (Actually, it's
+only that good if the strings' first bytes are pretty well spread out.
+If you have a table containing English words, for example, you might
+only get about one part in 26 this way, since the first bytes will
+probably only run from A to Z.  Might be better to use the first two
+characters of the string to compute the selectivity representation.)
+
+In general, you can apply this logic as long as you can come up with
+some numerical approximation to the data type's sorting order.  It
+doesn't have to be exact.
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/outer
+++ b/doc/TODO.detail/outer
+From lockhart@alumni.caltech.edu Thu Jan  7 13:31:08 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA07771
+	for <maillist@candle.pha.pa.us>; Thu, 7 Jan 1999 13:31:06 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id NAA14597 for <maillist@candle.pha.pa.us>; Thu, 7 Jan 1999 13:27:37 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id SAA13416;
+	Thu, 7 Jan 1999 18:26:56 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <3694FC70.FAD67BC3@alumni.caltech.edu>
+Date: Thu, 07 Jan 1999 18:26:56 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.30 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: Postgres Hackers List <hackers@postgresql.org>
+Subject: Outer Joins (and need CASE help)
+References: <199901071747.MAA07054@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+> Thomas, do you need help on outer joins?
+
+Yes. I'm going slowly partly because I get distracted with other
+Postgres stuff like docs, and partly because I don't understand all of
+the pieces I'm working with.
+
+I've identified the place in the MergeJoin code where the null filling
+for outer joins needs to happen, and have the "merge walk" code done.
+But I don't have the supporting code which actually would know how to
+null-fill a result tuple from the left or right. I thought you might be
+interested in that?
+
+I've done some work in the parser, and can now do things like:
+
+postgres=> select * from t1 join t2 using (i);
+NOTICE:  JOIN not yet implemented
+i|j|i|k
+-+-+-+-
+1|2|1|3
+(1 row)
+
+But this is just an inner join, and the result isn't quite right since
+the second "i" column should probably be omitted. At the moment I
+transform it from the syntax above into existing parse nodes, and
+everything from there on works.
+
+I don't yet pass an explicit join node into the planner/optimizer, and
+that will be the hardest part I assume. Perhaps we can work on that
+together.
+
+So, what I'll try to do (soon, in the next few days?) is put in
+
+  #ifdef ENABLE_OUTER_JOINS
+
+conditional code into the parser area (already there for the executor)
+and commit everything to the development tree. Does that sound OK?
+
+Oh, and if anyone is looking for something to do, I've got a couple of
+CASE statements in the case.sql regression test which are commented out
+because they crash the backend. They involve references to multiple
+tables within a single result column, and in other contexts that
+construct works. It would be great if someone had time to track it
+down...
+
+                     - Tom
+
+From lockhart@alumni.caltech.edu Mon Feb 22 02:01:13 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA22073
+	for <maillist@candle.pha.pa.us>; Mon, 22 Feb 1999 02:01:12 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id BAA26054 for <maillist@candle.pha.pa.us>; Mon, 22 Feb 1999 01:57:00 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA04715;
+	Mon, 22 Feb 1999 06:56:36 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36D0FFA4.32ADB75C@alumni.caltech.edu>
+Date: Mon, 22 Feb 1999 06:56:36 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: hackers@postgreSQL.org
+Subject: Re: start on outer join
+References: <199902220304.WAA10066@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+Bruce Momjian wrote:
+> 
+> > Will apply ... some other changes laying a bit of
+> > groundwork for outer joins so you can start on the planner/optimizer
+> > parts :)
+> Those will be a synch now that I understand the optimizer.  In fact, I
+> think it all will happen in the executor.
+
+I've modified executor/nodeMergeJoin.c to walk a left/right/both outer
+join, but didn't fill in the part which actually creates the result
+tuple (which will be the current left- or right-side tuple plus nulls
+for filler). I hope this is up your alley :)
+
+So far, I'm not certain what to pass to the planner. The syntax leads me
+to pass a select structure from gram.y with a "JoinExpr" structure in
+the "fromClause" list. I need to expand that with a combination of
+column names and qualifications, but at the time I see the JoinExpr I
+don't have access to the top query structure itself. So I may just keep
+a modestly transformed JoinExpr to expand later or to pass to the
+planner.
+
+btw, the EXCEPT/INTERSECT stuff from Stefan has some ugliness in gram.y
+which needs to be fixed (the shift/reduce conflict is not acceptable for
+our release version) and some of that code clearly needs to move to
+analyze.c or some other module.
+
+                     - Tom
+
+From maillist Wed Feb 24 05:27:08 1999
+Received: (from maillist@localhost)
+	by candle.pha.pa.us (8.9.0/8.9.0) id FAA09648;
+	Wed, 24 Feb 1999 05:27:08 -0500 (EST)
+From: Bruce Momjian <maillist>
+Message-Id: <199902241027.FAA09648@candle.pha.pa.us>
+Subject: Re: [HACKERS] OUTER joins
+In-Reply-To: <199902240953.EAA08561@candle.pha.pa.us> from Bruce Momjian at "Feb 24, 1999  4:53:21 am"
+To: maillist@candle.pha.pa.us (Bruce Momjian)
+Date: Wed, 24 Feb 1999 05:27:07 -0500 (EST)
+Cc: lockhart@alumni.caltech.edu, hackers@postgreSQL.org
+X-Mailer: ELM [version 2.4ME+ PL47 (25)]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+> 
+> How do you propose doing outer joins in non-mergejoin situations?
+> Mergejoins can only be used currently in equal joins.
+
+Is your solution going to be to make sure the OUTER table is always a
+MergeJoin, or on the outside of a join loop?  That could work.
+
+That could get tricky if the table is joined to _two_ other tables. 
+With the cleaned-up optimizer, we can disable non-merge joins in certain
+circumstances, and prevent OUTER tables from being inner in the others. 
+Is that the plan?
+
+-- 
+  Bruce Momjian                        |  http://www.op.net/~candle
+  maillist@candle.pha.pa.us            |  (610) 853-3000
+  +  If your life is a hard drive,     |  830 Blythe Avenue
+  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
+
+From lockhart@alumni.caltech.edu Mon Mar  1 13:01:08 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA21672
+	for <maillist@candle.pha.pa.us>; Mon, 1 Mar 1999 13:01:06 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA12756 for <maillist@candle.pha.pa.us>; Mon, 1 Mar 1999 12:14:16 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id RAA09406;
+	Mon, 1 Mar 1999 17:10:49 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36DACA19.E6DBE7D8@alumni.caltech.edu>
+Date: Mon, 01 Mar 1999 17:10:49 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: PostgreSQL-development <hackers@postgreSQL.org>
+Subject: Re: OUTER joins
+References: <199902240953.EAA08561@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+(back from a short vacation...)
+
+> How do you propose doing outer joins in non-mergejoin situations?
+> Mergejoins can only be used currently in equal joins.
+
+Hadn't thought about it, other than figuring that implementing the
+equi-join first was a good start. There is a class of outer join syntax
+(the USING clause) which is implicitly an equi-join...
+
+                        - Tom
+
+From lockhart@alumni.caltech.edu Mon Mar  8 21:55:02 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA15978
+	for <maillist@candle.pha.pa.us>; Mon, 8 Mar 1999 21:54:57 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id VAA15837 for <maillist@candle.pha.pa.us>; Mon, 8 Mar 1999 21:48:33 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id CAA06996;
+	Tue, 9 Mar 1999 02:46:40 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36E48B90.F3E902B7@alumni.caltech.edu>
+Date: Tue, 09 Mar 1999 02:46:40 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: hackers@postgreSQL.org
+Subject: Re: OUTER joins
+References: <199903070325.WAA10357@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+> > Hadn't thought about it, other than figuring that implementing the
+> > equi-join first was a good start. There is a class of outer join 
+> > syntax (the USING clause) which is implicitly an equi-join...
+> Not that easy.  You don't automatically get a mergejoin from an
+> equijoin.  I will have to force outer's to be either mergejoins, or
+> inners of non-merge joins.  Can you add code to non-merge joins in the
+> executor to throw out a null row if it does not find an inner match 
+> for the outer row, and I will handle the optimizer so it doesn't throw 
+> a non-conforming plan to the executor.
+
+So far I don't have enough info in the parser to get the
+planner/optimizer going. Should we work from the front to the back, or
+should I go ahead and look at the non-merge joins? It's painfully
+obvious that I don't know anything about the middle parts of this to
+proceed without lots more research.
+
+                        - Tom
+
+From lockhart@alumni.caltech.edu Tue Mar  9 22:47:57 1999
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07869
+	for <maillist@candle.pha.pa.us>; Tue, 9 Mar 1999 22:47:54 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id DAA14761;
+	Wed, 10 Mar 1999 03:46:43 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36E5EB23.F5CD959B@alumni.caltech.edu>
+Date: Wed, 10 Mar 1999 03:46:43 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>, tgl@mythos.jpl.nasa.gov
+Subject: Re: SQL outer
+References: <199903100112.UAA05772@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+>         select  *
+>         from    outer tab1, tab2, tab3
+>         where   tab1.col1 = tab2.col1 and
+>                 tab1.col1 = tab3.col1
+
+select *
+from t1 left join t2 using (c1)
+        join t3 on (c1 = t3.c1)
+
+Result:
+t1.c1	t1.c2	t2.c2	t3.c1
+2	12	NULL	32
+
+t1:
+c1	c2
+1	11
+2	12
+3	13
+4	14
+
+t2:
+c1	c2
+1	21
+3	23
+
+t3:
+c1	c2
+2	32
+
+From lockhart@alumni.caltech.edu Wed Mar 10 10:48:54 1999
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA16741
+	for <maillist@candle.pha.pa.us>; Wed, 10 Mar 1999 10:48:51 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id PAA17723;
+	Wed, 10 Mar 1999 15:48:31 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36E6944F.1F93B08@alumni.caltech.edu>
+Date: Wed, 10 Mar 1999 15:48:31 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: Thomas Lockhart <lockhart@alumni.caltech.edu>
+Subject: Re: SQL outer
+References: <199903100112.UAA05772@candle.pha.pa.us> <36E5EB23.F5CD959B@alumni.caltech.edu>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+Just thinking...
+
+If the initial RelOptInfo groupings are derived from the WHERE clause
+expressions, how about marking the "outer" property in those expressions
+in the parser? istm that is where the parser knows about two tables in
+one place, and I'm generating those expressions anyway. We could add a
+field(s) to the expression structure, or pass along a slightly different
+structure...
+
+                         - Tom
+
--- a/doc/TODO.detail/performance
+++ b/doc/TODO.detail/performance
+From owner-pgsql-hackers@hub.org Sun Jun 14 18:45:04 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id SAA03690
+	for <maillist@candle.pha.pa.us>; Sun, 14 Jun 1998 18:45:00 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id SAA28049; Sun, 14 Jun 1998 18:39:42 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 14 Jun 1998 18:36:06 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id SAA27943 for pgsql-hackers-outgoing; Sun, 14 Jun 1998 18:36:04 -0400 (EDT)
+Received: from angular.illustra.com (ifmxoak.illustra.com [206.175.10.34]) by hub.org (8.8.8/8.7.5) with ESMTP id SAA27925 for <pgsql-hackers@postgresql.org>; Sun, 14 Jun 1998 18:35:47 -0400 (EDT)
+Received: from hawk.illustra.com (hawk.illustra.com [158.58.61.70]) by angular.illustra.com (8.7.4/8.7.3) with SMTP id PAA21293 for <pgsql-hackers@postgresql.org>; Sun, 14 Jun 1998 15:35:12 -0700 (PDT)
+Received: by hawk.illustra.com (5.x/smail2.5/06-10-94/S)
+	id AA07922; Sun, 14 Jun 1998 15:35:13 -0700
+From: dg@illustra.com (David Gould)
+Message-Id: <9806142235.AA07922@hawk.illustra.com>
+Subject: [HACKERS] performance tests, initial results
+To: pgsql-hackers@postgreSQL.org
+Date: Sun, 14 Jun 1998 15:35:13 -0700 (PDT)
+Mime-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+
+I have been playing a little with the performance tests found in
+pgsql/src/tests/performance and have a few observations that might be of
+minor interest.
+
+The tests themselves are simple enough although the result parsing in the
+driver did not work on Linux. I am enclosing a patch below to fix this. I
+think it will also work better on the other systems.
+
+A summary of results from my testing are below. Details are at the bottom
+of this message.
+
+My test system is 'leslie':
+
+ linux 2.0.32, gcc version 2.7.2.3
+ P133, HX chipset, 512K L2, 32MB mem
+ NCR810 fast scsi, Quantum Atlas 2GB drive (7200 rpm).
+
+
+                     Results Summary (times in seconds)
+
+                    Single txn 8K txn    Create 8K idx 8K random Simple
+Case Description    8K insert  8K insert Index  Insert Scans     Orderby
+=================== ========== ========= ====== ====== ========= =======
+1 From Distribution
+  P90 FreeBsd -B256      39.56   1190.98   3.69  46.65     65.49    2.27
+  IDE
+
+2 Running on leslie
+  P133 Linux 2.0.32      15.48    326.75   2.99  20.69     35.81    1.68
+  SCSI 32M
+
+3 leslie, -o -F
+  no forced writes       15.90     24.98   2.63  20.46     36.43    1.69
+
+4 leslie, -o -F
+  no ASSERTS             14.92     23.23   1.38  18.67     33.79    1.58
+
+5 leslie, -o -F -B2048
+  more buffers           21.31     42.28   2.65  25.74     42.26    1.72
+
+6 leslie, -o -F -B2048
+  more bufs, no ASSERT   20.52     39.79   1.40  24.77     39.51    1.55
+
+
+
+
+                 Case to Case Difference Factors (+ is faster)
+
+                    Single txn 8K txn    Create 8K idx 8K random Simple
+Case Description    8K insert  8K insert Index  Insert Scans     Orderby
+=================== ========== ========= ====== ====== ========= =======
+
+leslie vs BSD P90.        2.56      3.65   1.23   2.25      1.83    1.35
+
+(noflush -F) vs no -F    -1.03     13.08   1.14   1.01     -1.02    1.00
+
+No Assert vs Assert       1.05      1.07   1.90   1.06      1.07    1.09
+
+-B256 vs -B2048           1.34      1.69   1.01   1.26      1.16    1.02
+
+
+Observations:
+
+ - leslie (P133 linux) appears to be about 1.8 times faster than the
+   P90 BSD system used for the test result distributed with the source, not
+   counting the 8K txn insert case which was completely disk bound.
+
+ - SCSI disks make a big (factor of 3.6) difference. During this test the
+   disk was hammering and cpu utilization was < 10%.
+
+ - Assertion checking seems to cost about 7% except for create index where
+   it costs 90%
+
+ - the -F option to avoid flushing buffers has tremendous effect if there are
+   many very small transactions. Or, another way, flushing at the end of the
+   transaction is a major disaster for performance.
+
+ - Something is very wrong with our buffer cache implementation. Going from
+   256 buffers to 2048 buffers costs an average of 25%. In the 8K txn case
+   it costs about 70%. I see looking at the code and profiling that in the 8K
+   txn case this is in BufferSync() which examines all the buffers at commit
+   time. I don't quite understand why it is so costly for the single 8K row
+   txn (35%) though.
+
+It would be nice to have some more tests. Maybe the Wisconsin stuff will
+be useful.
+
+
+
+----------------- patch to test harness. apply from pgsql ------------
+*** src/test/performance/runtests.pl.orig	Sun Jun 14 11:34:04 1998
+
+Differences %
+
+
+----------------- patch to test harness. apply from pgsql ------------
+*** src/test/performance/runtests.pl.orig	Sun Jun 14 11:34:04 1998
+--- src/test/performance/runtests.pl	Sun Jun 14 12:07:30 1998
+***************
+*** 84,123 ****
+  open (STDERR, ">$TmpFile") or die;
+  select (STDERR); $| = 1;
+  
+! for ($i = 0; $i <= $#perftests; $i++)
+! {
+  	$test = $perftests[$i];
+  	($test, $XACTBLOCK) = split (/ /, $test);
+  	$runtest = $test;
+! 	if ( $test =~ /\.ntm/ )
+! 	{
+! 		# 
+  		# No timing for this queries
+- 		# 
+  		close (STDERR);		# close $TmpFile
+  		open (STDERR, ">/dev/null") or die;
+  		$runtest =~ s/\.ntm//;
+  	}
+! 	else
+! 	{
+  		close (STDOUT);
+  		open(STDOUT, ">&SAVEOUT");
+  		print STDOUT "\nRunning: $perftests[$i+1] ...";
+  		close (STDOUT);
+  		open (STDOUT, ">/dev/null") or die;
+  		select (STDERR); $| = 1;
+! 		printf "$perftests[$i+1]: ";
+  	}
+  
+  	do "sqls/$runtest";
+  
+  	# Restore STDERR to $TmpFile
+! 	if ( $test =~ /\.ntm/ )
+! 	{
+  		close (STDERR);
+  		open (STDERR, ">>$TmpFile") or die;
+  	}
+- 
+  	select (STDERR); $| = 1;
+  	$i++;
+  }
+--- 84,116 ----
+  open (STDERR, ">$TmpFile") or die;
+  select (STDERR); $| = 1;
+  
+! for ($i = 0; $i <= $#perftests; $i++) {
+  	$test = $perftests[$i];
+  	($test, $XACTBLOCK) = split (/ /, $test);
+  	$runtest = $test;
+! 	if ( $test =~ /\.ntm/ ) {
+  		# No timing for this queries
+  		close (STDERR);		# close $TmpFile
+  		open (STDERR, ">/dev/null") or die;
+  		$runtest =~ s/\.ntm//;
+  	}
+! 	else {
+  		close (STDOUT);
+  		open(STDOUT, ">&SAVEOUT");
+  		print STDOUT "\nRunning: $perftests[$i+1] ...";
+  		close (STDOUT);
+  		open (STDOUT, ">/dev/null") or die;
+  		select (STDERR); $| = 1;
+! 		print "$perftests[$i+1]: ";
+  	}
+  
+  	do "sqls/$runtest";
+  
+  	# Restore STDERR to $TmpFile
+! 	if ( $test =~ /\.ntm/ ) {
+  		close (STDERR);
+  		open (STDERR, ">>$TmpFile") or die;
+  	}
+  	select (STDERR); $| = 1;
+  	$i++;
+  }
+***************
+*** 128,138 ****
+  open (TMPF, "<$TmpFile") or die;
+  open (RESF, ">$ResFile") or die;
+  
+! while (<TMPF>)
+! {
+! 	$str = $_;
+! 	($test, $rtime) = split (/:/, $str);
+! 	($tmp, $rtime, $rest) = split (/[ 	]+/, $rtime);
+! 	print RESF "$test: $rtime\n";
+  }
+  
+--- 121,130 ----
+  open (TMPF, "<$TmpFile") or die;
+  open (RESF, ">$ResFile") or die;
+  
+! while (<TMPF>) {
+!         if (m/^(.*: ).* ([0-9:.]+) *elapsed/) {
+! 	    ($test, $rtime) = ($1, $2);
+! 	     print RESF $test, $rtime, "\n";
+!         }
+  }
+
+------------------------------------------------------------------------
+
+  
+------------------------- testcase detail --------------------------
+   
+1. from distribution
+   DBMS:		PostgreSQL 6.2b10
+   OS:		FreeBSD 2.1.5-RELEASE
+   HardWare:	i586/90, 24M RAM, IDE
+   StartUp:	postmaster -B 256 '-o -S 2048' -S
+   Compiler:	gcc 2.6.3
+   Compiled:	-O, without CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.20
+   8192 INSERTs INTO SIMPLE (1 xact): 39.58
+   8192 INSERTs INTO SIMPLE (8192 xacts): 1190.98
+   Create INDEX on SIMPLE: 3.69
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 46.65
+   8192 random INDEX scans on SIMPLE (1 xact): 65.49
+   ORDER BY SIMPLE: 2.27
+   
+   
+2. run on leslie with asserts
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 256 '-o -S 2048' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, WITH CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.10
+   8192 INSERTs INTO SIMPLE (1 xact): 15.48
+   8192 INSERTs INTO SIMPLE (8192 xacts): 326.75
+   Create INDEX on SIMPLE: 2.99
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 20.69
+   8192 random INDEX scans on SIMPLE (1 xact): 35.81
+   ORDER BY SIMPLE: 1.68
+   
+   
+3. with -F to avoid forced i/o
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 256 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, WITH CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.10
+   8192 INSERTs INTO SIMPLE (1 xact): 15.90
+   8192 INSERTs INTO SIMPLE (8192 xacts): 24.98
+   Create INDEX on SIMPLE: 2.63
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 20.46
+   8192 random INDEX scans on SIMPLE (1 xact): 36.43
+   ORDER BY SIMPLE: 1.69
+   
+   
+4. no asserts, -F to avoid forced I/O
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 256 '-o -S 2048' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, No CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.10
+   8192 INSERTs INTO SIMPLE (1 xact): 14.92
+   8192 INSERTs INTO SIMPLE (8192 xacts): 23.23
+   Create INDEX on SIMPLE: 1.38
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 18.67
+   8192 random INDEX scans on SIMPLE (1 xact): 33.79
+   ORDER BY SIMPLE: 1.58
+   
+   
+5. with more buffers (2048 vs 256) and -F to avoid forced i/o
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 2048 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, WITH CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.11
+   8192 INSERTs INTO SIMPLE (1 xact): 21.31
+   8192 INSERTs INTO SIMPLE (8192 xacts): 42.28
+   Create INDEX on SIMPLE: 2.65
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 25.74
+   8192 random INDEX scans on SIMPLE (1 xact): 42.26
+   ORDER BY SIMPLE: 1.72
+   
+   
+6. No Asserts, more buffers (2048 vs 256) and -F to avoid forced i/o
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 2048 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, No CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.11
+   8192 INSERTs INTO SIMPLE (1 xact): 20.52
+   8192 INSERTs INTO SIMPLE (8192 xacts): 39.79
+   Create INDEX on SIMPLE: 1.40
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 24.77
+   8192 random INDEX scans on SIMPLE (1 xact): 39.51
+   ORDER BY SIMPLE: 1.55
+---------------------------------------------------------------------
+
+-dg
+
+David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
+Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
+"Don't worry about people stealing your ideas.  If your ideas are any
+ good, you'll have to ram them down people's throats." -- Howard Aiken
+
+
--- a/doc/TODO.detail/persistent
+++ b/doc/TODO.detail/persistent
+From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
+	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
+	Mon, 11 May 1998 11:14:43 -0400 (EDT)
+To: Brett McCormick <brett@work.chicken.org>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
+In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
+             <13655.4384.345723.466046@abraxas.scene.com> 
+Date: Mon, 11 May 1998 11:14:43 -0400
+Message-ID: <24913.894899683@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Brett McCormick <brett@work.chicken.org> writes:
+> same way that the current network socket is passed -- through an execv
+> argument.  hopefully, however, the non-execv()ing fork will be in 6.4.
+
+Um, you missed the point, Brett.  David was hoping to transfer a client
+connection from the postmaster to an *already existing* backend process.
+Fork, with or without exec, solves the problem for a backend that's
+started after the postmaster has accepted the client socket.
+
+This does lead to a different line of thought, however.  Pre-started
+backends would have access to the "master" connection socket on which
+the postmaster listens for client connections, right?  Suppose that we
+fire the postmaster as postmaster, and demote it to being simply a
+manufacturer of new backend processes as old ones get used up.  Have
+one of the idle backend processes be the one doing the accept() on the
+master socket.  Once it has a client connection, it performs the
+authentication handshake and then starts serving the client (or just
+quits if authentication fails).  Meanwhile the next idle backend process
+has executed accept() on the master socket and is waiting for the next
+client; and shortly the postmaster/factory/whateverwecallitnow notices
+that it needs to start another backend to add to the idle-backend pool.
+
+This'd probably need some interlocking among the backends.  I have no
+idea whether it'd be safe to have all the idle backends trying to
+do accept() on the master socket simultaneously, but it sounds risky.
+Better to use a mutex so that only one gets to do it while the others
+sleep.
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
+	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
+	Mon, 11 May 1998 11:26:44 -0400 (EDT)
+To: Brett McCormick <brett@work.chicken.org>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
+In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
+             <13655.4384.345723.466046@abraxas.scene.com> 
+Date: Mon, 11 May 1998 11:26:44 -0400
+Message-ID: <25004.894900404@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Meanwhile, *I* missed the point about Brett's second comment :-(
+
+Brett McCormick <brett@work.chicken.org> writes:
+> There will have to be some sort of arg parsing in any case,
+> considering that you can pass configurable arguments to the backend..
+
+If we do the sort of change David and I were just discussing, then the
+pre-spawned backend would become responsible for parsing and dealing
+with the PGOPTIONS portion of the client's connection request message.
+That's just part of shifting the authentication handshake code from
+postmaster to backend, so it shouldn't be too hard.
+
+BUT: the whole point is to be able to initialize the backend before it
+is connected to a client.  How much of the expensive backend startup
+work depends on having the client connection options available?
+Any work that needs to know the options will have to wait until after
+the client connects.  If that means most of the startup work can't
+happen in advance anyway, then we're out of luck; a pre-started backend
+won't save enough time to be worth the effort.  (Unless we are willing
+to eliminate or redefine the troublesome options...)
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/pg_shadow
+++ b/doc/TODO.detail/pg_shadow
+From owner-pgsql-hackers@hub.org Sun Aug  2 20:01:13 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id UAA15937
+	for <maillist@candle.pha.pa.us>; Sun, 2 Aug 1998 20:01:11 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id TAA01026 for <maillist@candle.pha.pa.us>; Sun, 2 Aug 1998 19:33:53 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id TAA19878; Sun, 2 Aug 1998 19:30:59 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 02 Aug 1998 19:28:23 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id TAA19534 for pgsql-hackers-outgoing; Sun, 2 Aug 1998 19:28:22 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id TAA19521 for <pgsql-hackers@postgreSQL.org>; Sun, 2 Aug 1998 19:28:15 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id TAA22594
+	for <pgsql-hackers@postgreSQL.org>; Sun, 2 Aug 1998 19:28:13 -0400 (EDT)
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] TODO item: make pg_shadow updates more robust
+Date: Sun, 02 Aug 1998 19:28:13 -0400
+Message-ID: <22591.902100493@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+I learned the hard way last night that the postmaster's password
+authentication routines don't look at the pg_shadow table.  They
+look at a separate file named pg_pwd, which certain backend operations
+will update from pg_shadow.  (This is not documented in any user
+documentation that I could find; I had to burrow into
+src/backend/commands/user.c to discover it.)
+
+Unfortunately, if a clueless dbadmin (like me ;-)) tries to update
+password data with the obvious thing,
+	update pg_shadow set passwd = 'xxxxx' where usename = 'yyyy';
+pg_pwd doesn't get fixed.
+
+A more drastic problem is that pg_dump believes it can save and
+restore pg_shadow data using "copy".  Following an initdb and restore
+from a pg_dump -z script, pg_shadow will look just fine, but only
+the database admin will be listed in pg_pwd.  This is likely to provoke
+some confusion, IMHO.
+
+As a short-term thing, the fact that you *must* set passwords with
+ALTER USER ought to be documented, preferably someplace where a
+dbadmin who's never heard of ALTER USER is likely to find it.
+
+As a longer-term thing, I think it would be far better if ordinary
+SQL operations on pg_shadow just did the right thing.  Wouldn't it
+be possible to implement copying to pg_pwd by means of a trigger on
+pg_shadow updates, or something like that?
+
+(I'm afraid that pg_dump -z is pretty well broken for operations on
+a password-protected database, btw.  Has anyone used it successfully
+in that situation?)
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/prepare
+++ b/doc/TODO.detail/prepare
+From owner-pgsql-hackers@hub.org Wed Nov 18 14:40:49 1998
+Received: from hub.org (majordom@hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA29743
+	for <maillist@candle.pha.pa.us>; Wed, 18 Nov 1998 14:40:36 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id OAA03716;
+	Wed, 18 Nov 1998 14:37:04 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 18 Nov 1998 14:34:39 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id OAA03395
+	for pgsql-hackers-outgoing; Wed, 18 Nov 1998 14:34:37 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id OAA03381
+	for <pgsql-hackers@hub.org>; Wed, 18 Nov 1998 14:34:31 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@hub.org 
+	id m0zgDnj-000EBTC; Wed, 18 Nov 98 21:02 MET
+Message-Id: <m0zgDnj-000EBTC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: Re: [HACKERS] PREPARE
+To: meskes@usa.net (Michael Meskes)
+Date: Wed, 18 Nov 1998 21:02:06 +0100 (MET)
+Cc: pgsql-hackers@hub.org
+Reply-To: jwieck@debis.com (Jan Wieck)
+In-Reply-To: <19981118084843.B869@usa.net> from "Michael Meskes" at Nov 18, 98 08:48:43 am
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Michael Meskes wrote:
+
+>
+> On Wed, Nov 18, 1998 at 03:23:30AM +0000, Thomas G. Lockhart wrote:
+> > > I didn't get this one completly. What input do you mean?
+> >
+> > Just the original string/query to be prepared...
+>
+> I see. But wouldn't it be more useful to preprocess the query and store the
+> resulting nodes instead? We don't want to parse the statement everytime a
+> variable binding comes in.
+
+    Right.  A real improvement would only be to have the prepared
+    execution plan in the backend and just giving  the  parameter
+    values.
+
+    I can think of the following construct:
+
+        PREPARE optimizable-statement;
+
+    That one will run parser/rewrite/planner, create a new memory
+    context with a unique identifier and  saves  the  querytree's
+    and  plan's  in  it.   Parameter values are identified by the
+    usual $n notation. The command returns the identifier.
+
+        EXECUTE QUERY identifier [value [, ...]];
+
+    then get's back the prepared plan and querytree  by  the  id,
+    creates  an  executor  context  with  the given values in the
+    parameter array and calls ExecutorRun() for them.
+
+    The PREPARE needs to analyze the resulting parsetrees to  get
+    the  datatypes  (and maybe atttypmod's) of the parameters, so
+    EXECUTE QUERY can convert the values into Datum's  using  the
+    types  input  functions.  And  the  EXECUTE has to be handled
+    special in tcop (it's something between a regular  query  and
+    an utility statement). But it's not too hard to implement.
+
+    Finally a
+
+        FORGET QUERY identifier;
+
+    (don't  remember  how  the  others  named it) will remove the
+    prepared plan etc. simply by destroying  the  memory  context
+    and dropping the identifier from the id->mcontext+prepareinfo
+    mapping.
+
+    This all  restricts  the  usage  of  PREPARE  to  optimizable
+    statements.  Is  it  required  to  be able to prepare utility
+    statements (like CREATE TABLE or so) too?
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
--- a/doc/TODO.detail/primary
+++ b/doc/TODO.detail/primary
+From owner-pgsql-hackers@hub.org Fri Sep  4 00:47:06 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA01047
+	for <maillist@candle.pha.pa.us>; Fri, 4 Sep 1998 00:47:05 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id XAA02044 for <maillist@candle.pha.pa.us>; Thu, 3 Sep 1998 23:11:07 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id XAA27418; Thu, 3 Sep 1998 23:06:16 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 03 Sep 1998 23:04:11 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id XAA27185 for pgsql-hackers-outgoing; Thu, 3 Sep 1998 23:04:09 -0400 (EDT)
+Received: from dune.krs.ru (dune.krs.ru [195.161.16.38]) by hub.org (8.8.8/8.7.5) with ESMTP id XAA27169 for <hackers@postgreSQL.org>; Thu, 3 Sep 1998 23:03:59 -0400 (EDT)
+Received: from krs.ru (localhost.krs.ru [127.0.0.1])
+	by dune.krs.ru (8.8.8/8.8.8) with ESMTP id LAA10059;
+	Fri, 4 Sep 1998 11:03:00 +0800 (KRSS)
+	(envelope-from vadim@krs.ru)
+Message-ID: <35EF5864.E5142D35@krs.ru>
+Date: Fri, 04 Sep 1998 11:03:00 +0800
+From: Vadim Mikheev <vadim@krs.ru>
+Organization: OJSC Rostelecom (Krasnoyarsk)
+X-Mailer: Mozilla 4.05 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
+MIME-Version: 1.0
+To: "D'Arcy J.M. Cain" <darcy@druid.net>
+CC: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>, hackers@postgreSQL.org
+Subject: Re: [HACKERS] Adding PRIMARY KEY info
+References: <m0zEaoV-00006JC@druid.net>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+D'Arcy J.M. Cain wrote:
+> 
+> Thus spake Vadim Mikheev
+> > Imho, indices should be used/created for FOREIGN keys and so pg_index
+> > is good place for both PRIMARY and FOREIGN keys infos.
+> 
+> Are you sure?  I don't know about implementing it but it seems more
+> like an attribute thing rather than an index thing.  Certainly from a
+> database design viewpoint you want to refer to the fields, not the
+> index on them.  If you put it into the index then you have to do
+> an extra join to get the information.
+> 
+> Perhaps you have to do the extra join anyway for other purposes so it
+> may not matter.  All I want is to be able to be able to extract the
+> field that the designer specified as the key.  As long as I can design
+> a select statement that gives me that I don't much care how it is
+> implemented.  I'll cache the information anyway so it won't have a
+> huge impact on my programs.
+
+First, let me note that you have to add int28 field to pg_class,
+not just oid field, to know what attributeS are in primary key
+(we support multi-attribute primary keys).
+This could be done...
+But what about foreign and unique (!) keys ?
+There may be _many_ foreign/unique keys defined for one table!
+And so foreign/unique keys info have to be stored somewhere else,
+not in pg_class.
+
+pg_index is good place for all _3_ key types because of:
+
+1. index should be created for each foreign key - 
+   just for performance.
+2. pg_index already has int28 field for key attributes.
+3. pg_index already has indisunique (note that foreign keys 
+   may reference unique keys, not just primary ones).
+
+- so we have just add two fields to pg_index:
+
+bool indisprimary;
+oid  indreferenced; 
+^^^^^^^^^^^^^^^^^^
+this is for foreign keys: oid of referenced relation' 
+primary/unique key index.
+
+I agreed that indices are just implementation...
+If you don't like to store key infos in pg_index then
+new pg_key relation have to be added...
+
+Comments ?
+
+Vadim
+
+
+From owner-pgsql-hackers@hub.org Sat Sep  5 02:01:13 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA14437
+	for <maillist@candle.pha.pa.us>; Sat, 5 Sep 1998 02:01:11 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id BAA09928 for <maillist@candle.pha.pa.us>; Sat, 5 Sep 1998 01:48:32 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA18282; Sat, 5 Sep 1998 01:43:16 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 05 Sep 1998 01:41:40 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA18241 for pgsql-hackers-outgoing; Sat, 5 Sep 1998 01:41:38 -0400 (EDT)
+Received: from dune.krs.ru (dune.krs.ru [195.161.16.38]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA18211; Sat, 5 Sep 1998 01:41:21 -0400 (EDT)
+Received: from krs.ru (localhost.krs.ru [127.0.0.1])
+	by dune.krs.ru (8.8.8/8.8.8) with ESMTP id NAA20555;
+	Sat, 5 Sep 1998 13:40:44 +0800 (KRSS)
+	(envelope-from vadim@krs.ru)
+Message-ID: <35F0CEDB.AD721090@krs.ru>
+Date: Sat, 05 Sep 1998 13:40:43 +0800
+From: Vadim Mikheev <vadim@krs.ru>
+Organization: OJSC Rostelecom (Krasnoyarsk)
+X-Mailer: Mozilla 4.05 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
+MIME-Version: 1.0
+To: "D'Arcy J.M. Cain" <darcy@druid.net>
+CC: hackers@postgreSQL.org, pgsql-core@postgreSQL.org
+Subject: Re: [HACKERS] Adding PRIMARY KEY info
+References: <m0zEvLK-00006FC@druid.net>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+D'Arcy J.M. Cain wrote:
+> 
+> >
+> > pg_index is good place for all _3_ key types because of:
+> >
+> > 1. index should be created for each foreign key -
+> >    just for performance.
+> > 2. pg_index already has int28 field for key attributes.
+> > 3. pg_index already has indisunique (note that foreign keys
+> >    may reference unique keys, not just primary ones).
+> >
+> > - so we have just add two fields to pg_index:
+> >
+> > bool indisprimary;
+> > oid  indreferenced;
+> > ^^^^^^^^^^^^^^^^^^
+> > this is for foreign keys: oid of referenced relation'
+> > primary/unique key index.
+> 
+> Sounds fine to me.  Any chance of seeing this in 6.4?
+
+I could add this (and FOREIGN key implementation) before
+11-13 Sep... But not the ALTER TABLE ADD/DROP CONSTRAINT
+stuff (ok for Entry SQL).
+But we are in beta...
+
+Comments?
+
+> Nope, pg_index is fine by me.  Now, once we have this, how do we find
+> the index for a particular attribute?  I can't seem to figure out the
+> relationship between pg_attribute and pg_index.  The chart in the docs
+> suggests that indkey is the relation but I can't see any useful info
+> there for joining the tables.
+
+pg_index:
+	indrelid - oid of indexed relation
+	indkey   - up to the 8 attnums
+
+pg_attribute:
+	attrelid - oid of relation
+	attnum   - ...
+
+Without outer join you have to query pg_attribute for each
+valid attnum from pg_index->indkey -:(
+
+Vadim
+
+
--- a/doc/TODO.detail/tcl_arrays
+++ b/doc/TODO.detail/tcl_arrays
+From owner-pgsql-patches@hub.org Wed Oct 14 17:31:26 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA01594
+	for <maillist@candle.pha.pa.us>; Wed, 14 Oct 1998 17:31:24 -0400 (EDT)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id RAA01745 for <maillist@candle.pha.pa.us>; Wed, 14 Oct 1998 17:12:28 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id RAA06607;
+	Wed, 14 Oct 1998 17:10:43 -0400 (EDT)
+	(envelope-from owner-pgsql-patches@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 14 Oct 1998 17:10:27 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id RAA06562
+	for pgsql-patches-outgoing; Wed, 14 Oct 1998 17:10:26 -0400 (EDT)
+	(envelope-from owner-pgsql-patches@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-patches@postgreSQL.org using -f
+Received: from mambo.cs.unitn.it (mambo.cs.unitn.it [193.205.199.204])
+	by hub.org (8.8.8/8.8.8) with SMTP id RAA06494
+	for <pgsql-patches@postgreSQL.org>; Wed, 14 Oct 1998 17:10:01 -0400 (EDT)
+	(envelope-from dz@cs.unitn.it)
+Received: from nikita.wizard.net (ts-slip31.gelso.unitn.it [193.205.200.31]) by mambo.cs.unitn.it (8.6.12/8.6.12) with ESMTP id XAA20316 for <pgsql-patches@postgreSQL.org>; Wed, 14 Oct 1998 23:09:52 +0200
+Received: (from dz@localhost) by nikita.wizard.net (8.8.5/8.6.9) id WAA00489 for pgsql-patches@postgreSQL.org; Wed, 14 Oct 1998 22:56:58 +0200
+From: Massimo Dal Zotto <dz@cs.unitn.it>
+Message-Id: <199810142056.WAA00489@nikita.wizard.net>
+Subject: [PATCHES] TCL_ARRAYS
+To: pgsql-patches@postgreSQL.org (Pgsql Patches)
+Date: Wed, 14 Oct 1998 22:56:58 +0200 (MET DST)
+X-Mailer: ELM [version 2.4 PL24 ME4]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=iso-8859-1
+Content-Transfer-Encoding: 8bit
+Sender: owner-pgsql-patches@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Hi,
+
+I have written this patch which fixes some problems with TCL_ARRAYS.
+The new array code uses a temporary buffer and is disabled by default
+because it depends on contrib/string-io which most of you don't use.
+This raises once again the problem of backslashes/escapes and various
+ambiguities in pgsql output. I hope this will be solved in 6.5.
+
+*** src/interfaces/libpgtcl/pgtclCmds.c.orig	Mon Sep 21 09:00:19 1998
+--- src/interfaces/libpgtcl/pgtclCmds.c	Wed Oct 14 15:32:21 1998
+***************
+*** 602,616 ****
+  		{
+  			for (i = 0; i < PQnfields(result); i++)
+  			{
+  				sprintf(nameBuffer, "%d,%.200s", tupno, PQfname(result, i));
+  				if (Tcl_SetVar2(interp, arrVar, nameBuffer,
+! #ifdef TCL_ARRAYS
+! 								tcl_value(PQgetvalue(result, tupno, i)),
+  #else
+  								PQgetvalue(result, tupno, i),
+- #endif
+  								TCL_LEAVE_ERR_MSG) == NULL)
+  					return TCL_ERROR;
+  			}
+  		}
+  		Tcl_AppendResult(interp, arrVar, 0);
+--- 602,624 ----
+  		{
+  			for (i = 0; i < PQnfields(result); i++)
+  			{
+ #ifdef TCL_ARRAYS
+ 				char *buff = strdup(PQgetvalue(result, tupno, i));
+  				sprintf(nameBuffer, "%d,%.200s", tupno, PQfname(result, i));
+  				if (Tcl_SetVar2(interp, arrVar, nameBuffer,
+! 								tcl_value(buff),
+! 								TCL_LEAVE_ERR_MSG) == NULL) {
+! 					free(buff);
+! 					return TCL_ERROR;
+! 				}
+! 				free(buff);
+  #else
+ 				sprintf(nameBuffer, "%d,%.200s", tupno, PQfname(result, i));
+ 				if (Tcl_SetVar2(interp, arrVar, nameBuffer,
+  								PQgetvalue(result, tupno, i),
+  								TCL_LEAVE_ERR_MSG) == NULL)
+  					return TCL_ERROR;
+ #endif
+  			}
+  		}
+  		Tcl_AppendResult(interp, arrVar, 0);
+***************
+*** 636,643 ****
+  		 */
+  		for (tupno = 0; tupno < PQntuples(result); tupno++)
+  		{
+  			const char *field0 = PQgetvalue(result, tupno, 0);
+! 			char * workspace = malloc(strlen(field0) + strlen(appendstr) + 210);
+  
+  			for (i = 1; i < PQnfields(result); i++)
+  			{
+--- 644,674 ----
+  		 */
+  		for (tupno = 0; tupno < PQntuples(result); tupno++)
+  		{
+ #ifdef TCL_ARRAYS
+ 			char *buff = strdup(PQgetvalue(result, tupno, 0));
+ 			const char *field0 = tcl_value(buff);
+ 			char *workspace = malloc(strlen(field0) + 210 + strlen(appendstr));
+ 
+ 			for (i = 1; i < PQnfields(result); i++)
+ 			{
+ 				free(buff);
+ 				buff = strdup(PQgetvalue(result, tupno, i));
+ 				sprintf(workspace, "%s,%.200s%s", field0, PQfname(result,i),
+ 						appendstr);
+ 				if (Tcl_SetVar2(interp, arrVar, workspace,
+ 								tcl_value(buff),
+ 								TCL_LEAVE_ERR_MSG) == NULL)
+ 				{
+ 					free(buff);
+ 					free(workspace);
+ 					return TCL_ERROR;
+ 				}
+ 			}
+ 			free(buff);
+ 			free(workspace);
+ #else
+  			const char *field0 = PQgetvalue(result, tupno, 0);
+! 			char *workspace = malloc(strlen(field0) + 210 + strlen(appendstr));
+  
+  			for (i = 1; i < PQnfields(result); i++)
+  			{
+***************
+*** 652,657 ****
+--- 683,689 ----
+  				}
+  			}
+  			free(workspace);
+ #endif
+  		}
+  		Tcl_AppendResult(interp, arrVar, 0);
+  		return TCL_OK;
+***************
+*** 669,676 ****
+--- 701,716 ----
+  			Tcl_AppendResult(interp, "argument to getTuple cannot exceed number of tuples - 1", 0);
+  			return TCL_ERROR;
+  		}
+ #ifdef TCL_ARRAYS
+ 		for (i = 0; i < PQnfields(result); i++) {
+ 			char *buff = strdup(PQgetvalue(result, tupno, i));
+ 			Tcl_AppendElement(interp, tcl_value(buff));
+ 			free(buff);
+ 		}
+ #else
+  		for (i = 0; i < PQnfields(result); i++)
+  			Tcl_AppendElement(interp, PQgetvalue(result, tupno, i));
+ #endif
+  		return TCL_OK;
+  	}
+  	else if (strcmp(opt, "-tupleArray") == 0)
+***************
+*** 688,697 ****
+--- 728,748 ----
+  		}
+  		for (i = 0; i < PQnfields(result); i++)
+  		{
+ #ifdef TCL_ARRAYS
+ 			char *buff = strdup(PQgetvalue(result, tupno, i));
+ 			if (Tcl_SetVar2(interp, argv[4], PQfname(result, i),
+ 							tcl_value(buff),
+ 							TCL_LEAVE_ERR_MSG) == NULL) {
+ 				free(buff);
+ 				return TCL_ERROR;
+ 			}
+ 			free(buff);
+ #else
+  			if (Tcl_SetVar2(interp, argv[4], PQfname(result, i),
+  							PQgetvalue(result, tupno, i),
+  							TCL_LEAVE_ERR_MSG) == NULL)
+  				return TCL_ERROR;
+ #endif
+  		}
+  		return TCL_OK;
+  	}
+***************
+*** 1303,1310 ****
+  		sprintf(buffer, "%d", tupno);
+  		Tcl_SetVar2(interp, argv[3], ".tupno", buffer, 0);
+  
+  		for (column = 0; column < ncols; column++)
+! 			Tcl_SetVar2(interp, argv[3], info[column].cname, PQgetvalue(result, tupno, column), 0);
+  
+  		Tcl_SetVar2(interp, argv[3], ".command", "update", 0);
+  
+--- 1354,1371 ----
+  		sprintf(buffer, "%d", tupno);
+  		Tcl_SetVar2(interp, argv[3], ".tupno", buffer, 0);
+  
+ #ifdef TCL_ARRAYS
+ 		for (column = 0; column < ncols; column++) {
+ 			char *buff = strdup(PQgetvalue(result, tupno, column));
+ 			Tcl_SetVar2(interp, argv[3], info[column].cname,
+ 						tcl_value(buff), 0);
+ 			free(buff);
+ 		}
+ #else
+  		for (column = 0; column < ncols; column++)
+! 			Tcl_SetVar2(interp, argv[3], info[column].cname,
+! 						PQgetvalue(result, tupno, column), 0);
+! #endif
+  
+  		Tcl_SetVar2(interp, argv[3], ".command", "update", 0);
+  
+*** src/include/config.h.in.orig	Wed Aug 26 09:01:16 1998
+--- src/include/config.h.in	Wed Oct 14 22:44:00 1998
+***************
+*** 312,318 ****
+   * of postgres C-like arrays, for example {{"a1" "a2"} {"b1" "b2"}} instead 
+   * of {{"a1","a2"},{"b1","b2"}}.
+   */
+! #define TCL_ARRAYS
+  
+  /*
+   * The following flag allows limiting the number of rows returned by a query.
+--- 312,318 ----
+   * of postgres C-like arrays, for example {{"a1" "a2"} {"b1" "b2"}} instead 
+   * of {{"a1","a2"},{"b1","b2"}}.
+   */
+! /* #define TCL_ARRAYS */
+  
+  /*
+   * The following flag allows limiting the number of rows returned by a query.
+
+-- 
+Massimo Dal Zotto
+
+----------------------------------------------------------------------+
+|  Massimo Dal Zotto                email:  dz@cs.unitn.it             |
+|  Via Marconi, 141                 phone:  ++39-461-534251            |
+|  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
+|  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
+----------------------------------------------------------------------+
+
+