
General Information
|
Documentation
|
Help
|
Downloads
|
Developer Information
|
Other Information
|
Search
|
|
Using GnuCash With the SQL Backend
Below is a copy of src/backend/postgres/README.
GnuCash SQL/Postgres Support/HOWTO
----------------------------------
The SQL backend allows multiple users to view/edit the same data
at the same time, as well as allowing a single user to keep data
in an SQL database.
Status
------
The SQL code has been "finished" for many years, and hasn't been
in recent active development. This means that the current SQL
backend does not support some of the newer GnuCash features;
in particular, it doesn't support the business objects needed for
the business accounting. It also doesn't currently support
scheduled/recurring transactions. Although the development is
currently dormant, we believe that it may heat up real soon,
since there is a core group of developers who are interested
in making SQL the default for all future versions of GnuCash.
As long as development is dormant, note that you may not get
prompt support on the mailing list, and, if you find bugs, they
might not be fixed quickly.
Currently, only Postgres is supported, but other databases are
planned. See also the 'to be done' list at the bottom for more
detail.
Table of Contents
-----------------
-- Postgres Install Instructions
-- GnuCash Build Instructions
-- How To Use GnuCash with Postgres
-- Remote Access
-- Access Modes
-- Performance
-- ToDo List
Postgres Install Instructions
-----------------------------
1) Install PostgresSQL server, client and devel packages.
(Both the 6.x and the 7.x versions of Postgres should work).
2) If installed from RedHat, then running /etc/rc.d/init.d/postgresql
will setup and initialize basic Postgres first-time setup & config.
3) As root, 'su - postgres' then run 'createuser' to add your user id
(don't set a password on your postgres db name, yet, GnuCash doesn't
have a GUI to ask for your password yet)
If you've forgotten what users are on the system, you can check
by starting the 'psql' command shell and typing the command.
'SELECT * FROM pg_shadow;'
Note this only works for the postgres user, and other users that
have createuser permissions.
4) (optional) enable TCPIP connections to remote hosts. To do this:
edit the file pg_hba.conf on the remote host to configure and allow
access from other hosts. See 'man pg_hba.conf' for details.
RedHat: /var/lib/pgsql/pg_hba.conf
Debian: /etc/postgresql/pg_hba.conf
Also create users. Be sure to set a password.
(there appears to be no way to pass an empty password to postgres)
Note also the user must have 'createuser' permissions in order to
lock tables (this is a bug in postgres 6.5 and maybe later ???)
5) Hints and Tips:
If you've forgotten what databases you've created in the past,
you can look the filesystem: 'ls -la /var/lib/postgres/data/base'
to view the existing databases. Alternately, if you connect as
user postgres, you can 'SELECT * FROM pg_database;'
Alternatively, you can install into a brand new database without
using root privileges. Perform the following:
1) Install postgresql server, client and devel packages.
2) initdb -D ~/gnucash
This creates a directory structure and supporting files
under ~/gnucash. The gnucash directory is automatically
created. The directories and files are owned by the user
running 'initdb' and have only user access; group and
other are cleared.
By default, the user running 'initdb' is setup as the
database superuser for the new database.
You don't have to use the name 'gnucash'; you can choose
whatever you want.
3) pg_ctl -D ~/gnucash -o "-p 6543" start
This starts a postmaster process for the new database
listening on port 6543. If your system doesn't already
have a postmaster processing running for the system
database, you can leave out '-o "-p 6543"' and it will
use the default port 5432. You can use any open port.
By default, the postmaster process will only accept
connections from processes on the local host.
GnuCash Build Instructions
--------------------------
Same as usual, but you must specify the flag '--enable-sql' in order
to build in Postgres support.
i.e.
./configure --enable-sql
and then 'make' and then 'make install'
How To Use GnuCash with Postgres
--------------------------------
a) Open your favorite datafile in the usual fashion.
b) Click on 'Save As'
c) Enter the following URL instead of a filename in the file picker:
postgres://localhost/some_dbname_you_pick
The above steps will copy your data into that database. You can
then restart gnucash (or keep working) and type in the same URL
in the file open dialogs. Or try it on the command line:
home:~ $ gnucash postgres://localhost/dbname_whatever
Note: GnuCash will automatically create the database if it does
not already exist. Do *not* try to create the database by hand,
or to specify a database that wasn't created by GnuCash.
You can also "copy" from the SQL database to an XML file.
If you don't trust the SQL backend, this might be a good way to
backup your data. To do this:
a) Open the SQL database in "single user mode":
postgres://localhost/dbname_whatever?mode=single-update
It is important to specify the single-user mode
(if you don't, not all of your data will be saved to the file)
b) Click on 'Save As' and enter an ordinary file name.
We promise that future versions of GnuCash will be backwards
compatible with the current SQL database layout. There is a
specific upgrade module designed into the system that handles
version upgrades.
Remote Access
-------------
In principle, you can use URL's such as:
postgres://some.where.com/dbname
However, these URL's require that the host 'some.where.com' have
TCPIP access enabled (by following the postgres install instruction (6)
above). This is true even if the hostname is your local machine. Thus,
unless you've setup Postgres TCPIP connections, you *MUST* use
URL's of the form postgres://localhost/dbname
You can specify usernames and passwords in the URL as follows:
postgres://localhost/dbname?user=whomever&password=asdf
Note that the username and password are the pg_shadow username and
password, *NOT* your unix login name and password.
(A GUI dialog to prompt for your password hasn't been implemented yet.)
Note that you (or your SQL database admin) will have to make sure
that access permissions on the various GnuCash tables have been set
appropriately. Typically, this will be by starting the 'psql'
shell, listing all relations with \z, and then issuing
a 'GRANT ALL on gncAccount TO someuser;', and so on, for each
relation. Failing to do this will result in lots of unhappiness.
The sysadmin will also need to make sure that TCP/IP connections
are properly enabled in the postgres server hba.conf file.
Access Modes
------------
There are four different ways or 'modes' in which to access your
database. These are
mode=single-file
mode=single-update
mode=multi-user-poll
mode=multi-user
The first two are single-user access modes. These are the
'safest' modes to use if you don't need multi-user capability.
The single-update mode is strongly preferred over single-file.
(The single-file mode is strongly deprecated, and can lead to
to massive data loss under certain circumstances: for example,
if your sql connection dies after all the old data has been
deleted, but before the new data has been written out. Bluntly:
DO NOT USE mode=single-file. You won't be sorry.).
The multi-user modes are intended for use when multiple users
need to view and make changes to the data. The default mode
is multi-user.
If you have many users (more than a half-dozen or so), you may
want to run in multi-user-poll mode, instead of the default
multi-user. This will stop the automatic updates of the
local instance of gnucash, and thus reduce the amount of
GUI redraws that happen automatically as other users
edit data. It should also reduce network traffic
slightly, and present a slightly lower load on the
sql server. Different users can mix-n-match the
two multi-user modes.
See the file 'design.txt' for an in-depth explanation of these
modes.
You can specify a particular access mode by specifying the URL
postgres://localhost/dbname?mode=whatever
You can alternate between multi-user and single-user modes for
the same database. Just be sure that all users have logged off,
otherwise gnucash won't let you log on in single-user mode.
Users are 'logged off' automatically when they exist gnucash.
In case they have crashed, (and thus appear to still be logged
in), you can log them off manually by issuing the following:
echo "UPDATE gncsession SET time_off='NOW' WHERE time_off = 'infinity';" | psql dbname
Hopefully, you do not need to be warned that this dangerous
if there really are other users logged in. Accessing the
database in single-user mode when there are other users logged
in will result in their work being clobbered.
Other Options
-------------
The 'options=' and 'tty=' postgres keywords are supported. See the
postgres documentation for more info.
Mangled Balances
----------------
If you've been working in multi-user mode for a while, and the
balances seem to be mangled, log off all the users, log on the the
same database in single-update mode, and log off again. This
will cause balance checkpoints to be recomputed, and should
probably fix any weird account balances you are seeing.
(Mangled account balances shouldn't happen, and if you have
a test case that can reliably recreate these, you should
submit a bug report).
(Note to developers: think of this as a safe fsck: balances
shouldn't get wacky, but if you've been running a long time
and something weird has happened, then a periodic cleanup
is good to have. Maybe this should be automated ...)
Performance
-----------
In 'single-user-update' mode, data loads from the sql database
should be 1.5x faster than comparable loads from the XML flat file,
at least for medium datasets (measured at 3.5 seconds on a 700MHz
Athalon a dataset with 3K transactions and 150 accounts, vs. 4.8
seconds loading from file; postgres version 7.1.2).
Hitting the 'save' button is a no-op and takes no CPU cycles.
(date is saved as its modified, so a global save is not needed).
'exit' will stall for a few seconds, while a 'vacuum analyze' is
issued. The 'vacuum' reclaims storage, and 'analyze' does some
performance tuning. Doing this regularly improves performance
about 20% in the cases I looked at.
---------
Accounts are restored roughly at the rate of 50-75 per second
(for above hardware/software config).
---------
Performance in multi-user mode is still a can of worms, and will
be a good bit slower for now. The working assumptions are that
there are millions of transactions, so you can't load all of them.
But to load only a subset is hard, in several ways. First, its
technically hard to figure out what subset to load. Then, once
this is done, it takes about 12 millisecs to load one transaction
when loading them one at a time (on an Athalon 700MHz). This is
about 75 transactions a second. Of this, 90% of the CPU cycles
are burned in the postgres server, so 'basic' tuning won't do
the trick ... some different algorithm is needed.
Handy for raw-file-loading performance measurement is the the
script 'scan-acct.pl' in src/optional/swig/examples'
Weird shit: loading the same data, but in slightly different
order, can make a *huge* difference in the speed of balance
subtotal calculations (used only in multi-user mode.) Merely
rearranging the order of the splits in a transaction can cause
a factor of 20 (twenty) difference in performance. This can turn
a 4.5 second load into a minute & a half load !!!! Yow!
Investigating ...
To Be Done
----------
Core bugs/features that still need work:
High Priority:
--------------
Currently, the database backend is completely missing support for
"lots", and thus cannot support capital gains computations. It is
also missing support for business objects.
-- remove the bogus 'price query' call in the backend, change it to
use the qof query infrastructure.
-- port to use SQLLite as a possible backend. Will need to abandon
balance checkpoints to do this.
Medium/Low Priority Work:
-------------------------
Nothing in this list is truly critical at this point, except possibly
the username/password dialog, and its effect with regards to initial db
access.
-- port to use libdbi bindings, instead of the postgres native
call interface. This should be easy.
-- Wire in the GUI to ask user for username/password to log onto the
server. (GUI already implemented, just not yet used).
-- The correct URL format for username-password should be:
postgres://username:passwd@host:port/dbname?key=value&key2=val2
However, the username-password part of the URI is not correctly
parsed at the moment.
-- distinguish between 'save' and 'save-as' in gnc-book & backend.
If user hits 'save' in the single-user or multi-user mode, it
should be a no-op (since the saves have already occurred). Only
a 'save-as' requires a from-scratch sync.
?? Maybe this has already been fixed in the redesigned session
& backend ???
-- document how to use the version/extension tables to allow
seamless upwards compatibility in the face of database schema
changes.
-- allow user to enter URL in GUI dialog. User can currently type URL
into the file dialog window; it would be nice to have something
slightly nicer tan this.
-- error code should include strings passed back, to be shown in
GUI dialogs. This is because the backend needs to return things
like usernames, etc. in the dialogs, and the backend doesn't
have the interfaces for passing this kind of info.
(actually, the PERR/PINFO strings might do ??)
In some cases, we have PERR without setting a backend error...
grep for all PERR's that don't set backend error.
-- note that transaction commit errors may in fact be i/o errors.
If an i/o error occurred during commit, there would be some
aberrant rollback behavior. Ouch.
-- the transaction rollback code needs to be a GUI popup...
-- review (actually, design & architect) the communications error
handling policy. For example, CopyToEngine() will leave
the backend in a disabled state if a communication error occurs;
there will be other debilitating conditions if the backend disappears,
leaving the engine in a possibly confused state.
-- during open of register window, a query of splits is typically run
twice (once during open -- LedgerDisplayInternal(), and again
during LedgerDisplayRefresh()) this results in twice as much data
to the sql backend as needed. This is a performance issue,
should someday be fixed.
-- during transaction edit, query of splits is run twice --
once during edit recordCB()->xaccSRSaveRegEntry(), and again
recordCB()->xaccSRRedrawReg(). As above, this is performance
problem waiting to aggravate.
-- minor optimization: KVP's for null strings store null strings,
this is a waste of time. This occurs primarily for transaction
memos, I think.
-- minor optimization: keeping an iguid cache will avoid sql queries.
also caching the count for single-user mode would be good
To Be Done, Part II
-------------------
This list only affects the multi-user and advanced/optional features.
Most of the items on this list are 'good-to-fix' but are not
truly critical. They may result in aberrations, but general
operation is probably OK.
-- when an account is edited, price lookups happen *twice* !!
once when opened for edit, and again when committed !!
(the first one is due to gnc_account_tree_refresh ()
called by gnc_account_window_create() near dialog-account.c:1642)
-- nice-to-have performance improvement:
many queries are needlessly duplicated, even in multi-user mode.
They are duplicated just in case some other user has modified
the data. But if we last checked just a few seconds ago,
we don't really need to update, not yet. We can wait a few
seconds ... doing this uniformly can cut down on sql traffic.
(This is particularly egregious for price queries). Implement this
by putting a datestamp in along with the version number. Let
the backend check the datestamp, and if its aged less than
15 seconds, skip doing the sql query.)
DONE --- but we need to detect redundant price queries .. ughh
-- if always in multi-mode, then initial checkpoints not set up.
They're not needed until there are a lot of transactions in the
system. Need to recompute checkpoints on some periodic basis.
-- NOTIFY is not updating main window balances ...
-- implement price change events
-- implement a version table so that we can be upwardly compatible
with future database changes.
-- use FOREIGN KEY to ensure that table integrity is maintained.
-- use table constraint ON DELETE to make sure we have no
dangling splits when a transaction is deleted. As added
benefit, this might simplify some logic.
-- during sync, detect and report conflicting edits to accounts
and transactions. See the notes for pgendSync() for details
as to what this is about. For the first pass, this is not a
serious issue; its a 'nice to have' thing. (sync is called
when user hits the 'save' button, and should be disabled for
multi-user modes)
-- implement account rollback (i.e. if other user has modified the
account, we need to do something to merge their work into ours...)
ditto for prices ...
-- bug: if another user deletes an account, we need to look at the
audit trail to see if the thing has been deleted.
Otherwise, any edit of this account will incorrectly
add the deleted account back in. (Note that from the user
perspective, deleting accounts is a bad idea ...)
(this is handled with an event, but is not handled via poll)
-- fix caching in the face of lost contact to the backend. If the
backend can't contact its server, then we should just save up caches,
and then when contact with backend re-established, we should spit
them out. The pgendSync routine should more or less do the trick;
note, however, that the XML file format needs to save the version
number ...
-- Implement various advanced database features, such as checking the
user's permission to view/edit account by account ... (hmmm this
done by the dbadmin... using SQL commands... which means if user
tries to write to something they're not allowed to write to,
then they should be bounced back.) Does some user have the permission
to create new accounts ??
-- Review versioning verification in backend. The desired semantic for
updates should be like CVS: multiple nearly-simultaneous writers
are allowed; however, only one wins, and others are kicked back.
The losers know themselves because they are trying to update info
of the wrong version.
-- pgend_transaction_commit does it correctly; but the GUI doesn't
report a rollback. (need to get err message out of engine, into
GUI).
-- pgTransactionSync() is broken, but its not used anywhere.
-- pgend_account_commit checks version but doesn't rollback.
(nor does the GUI report a rollback.)
-- pgendSync does the right thing, except that it doesn't
detect any version conflicts, nor does it notify the user
of such conflicts. (actually, it also screws checkpoints).
I'm not sure how critical this all is; with a small number of users
it shouldn't be a problem. With a huge number of users, each editing
the same transaction (unlikely!?) then there is risk of clobbered
data, but so what? versioning is at least partly a people-management
problem. Anyway, what's there now should be pretty good & should
work for now. Except its mostly untested.
-- use version numbers for commodities. Right now, multi-user
updates of commodities are not detected (this seem OK for
now, since this is a rare occurrence, right ???)
-- multi-user 'save-as' has unexpected results:
As it currently works, save-as (sync) is a mass-copy of all data
out of the engine into the named storage location. If the indicated
storage location doesn't exist, its created, and all of the engine
data is copied into it. If the storage location (i.e. database)
does exist, and contains data, then the engine contents are merged
into it.
From the engine point of view, the above is the easiest thing to do.
However, in the multi-user mode, it may not be what the user is
expecting. In the multi-user mode, the engine would not contain
a full copy of the database data; instead, it contains only a subset.
A 'save as', if implemented as above, would save only that subset,
instead of all of the data. If the user thought 'save as'
means the same thing as 'copy', and hoped to copy all of the sql
data to a file, they would be sorely disappointed. I don't
currently have an opinion on the 'best' way of dealing with this
situation. A reasonable solution may well be 'don't do that'.
-- Implement PR_BALANCE query type in gncquery.c. This query is
supposed to return unbalanced transactions from the database.
I think it might be better to leave this unbalanced, and just
plain force everything in the database to be balanced. In this
case, we should then modify the code to check for unbalanced
transactions before committing them, or otherwise force the backend
to run in balanced mode. We could set 'force_double_entry=2'
but this is currently unimplemented in the engine. Alternately,
we could call 'Scrub' at appropriate times.
============================= END OF FILE ========================
|