|
6.2.1 How to emulate the Oracle
decode function/crosstab
6.2.2 How to implement if-then-else
within a select-clause.
6.2.3 deleted due to copyright hassles with the publisher
6.2.4 How to pad with leading zeros an int
or smallint.
6.2.5 Divide by zero and nulls.
6.2.6 Convert months to financial
months.
6.2.7 Hierarchy traversal - BOMs.
6.2.8 Is it possible to call a UNIX
command from within a stored procedure or a trigger?
6.2.9 Information on Identities and
Rolling your own Sequential Keys
6.2.10 How can I execute dynamic SQL with ASE
6.2.11 Is it possible to concatenate all the values from a column and return a single row?
6.2.12 Selecting rows N to M without Oracle's rownum?
6.2.13 How can I return number of rows that are returned from a grouped query without using a temporary table?
If you are using ASE version 11.5 or later, the simplest way to
implement the Oracle decode is with the CASE statement. The following
code snippet should be compared with the example using a characteristic
function given
below .
SELECT STUDENT_ID,
(CASE WHEN COURSE_ID = 101 THEN 1 ELSE 0 END) AS COURSE_101,
(CASE WHEN COURSE_ID = 105 THEN 1 ELSE 0 END) AS COURSE_105,
(CASE WHEN COURSE_ID = 201 THEN 1 ELSE 0 END) AS COURSE_201,
(CASE WHEN COURSE_ID = 210 THEN 1 ELSE 0 END) AS COURSE_210,
(CASE WHEN COURSE_ID = 300 THEN 1 ELSE 0 END) AS COURSE_300
GROUP BY STUDENT_ID
ORDER BY STUDENT_ID
However, if you have a version of ASE that does not support the case
statement, then you will have to try the following. There may be other
reasons to try characteristics functions. If you go to the Amazon web site
and look for reviews for of Rozenshteins book, Advanced SQL, you will see
that one reviewer believes that a true crosstab is not possible with the
case statement. I am not sure. I have also not done any performance tests
to see which is quicker.
There is a neat way to use boolean logic to perform cross-tab or
rotation queries easily, and very efficiently. Using the aggregate 'Group
By' clause in a query and the ISNULL(), SIGN(), ABS(), SUBSTRING() and
CHARINDEX() functions, you can create queries and views to perform all
kinds of summarizations.
This technique does not produce easily understood SQL
statements.
If you want to test a field to see if it is equal to a value, say 100,
use the following code:
SELECT (1- ABS( SIGN( ISNULL( 100 - <field>, 1))))
The innermost function will return 1 when the field is null, a positive
value if the field < 100, a negative value if the field is > 100 and
will return 0 if the field = 100. This example is for Sybase or Microsoft
SQL server, but other servers should support most of these functions or the
COALESCE() function, which is the ANSI equivalent to ISNULL.
The SIGN() function returns zero for a zero value, -1 for a negative
value, 1 for a positive value The ABS() function returns zero for a zero
value, and > 1 for any non-zero value. In this case it will return 0 or
1 since the argument is the function SIGN(), thus acting as a binary
switch.
Put it all together and you get '0' if the value match, and '1' if they
don't. This is not that useful, so we subtract this return value from '1'
to invert it, giving us a TRUE value of '1' and a false value of '0'. These
return values can then be multiplied by the value of another column, or
used within the parameters of another function like SUBSTRING() to return a
conditional text value.
For example, to create a grid from a student registration table
containing STUDENT_ID and COURSE_ID columns, where there are 5 courses
(101, 105, 201, 210, 300) use the following query:
Compare this version with the case
statement above.
SELECT STUDENT_ID,
(1- ABS( SIGN( ISNULL( 101 - COURSE_ID, 1)))) COURSE_101,
(1- ABS( SIGN( ISNULL( 105 - COURSE_ID, 1)))) COURSE_105,
(1- ABS( SIGN( ISNULL( 201 - COURSE_ID, 1)))) COURSE_201,
(1- ABS( SIGN( ISNULL( 210 - COURSE_ID, 1)))) COURSE_210,
(1- ABS( SIGN( ISNULL( 300 - COURSE_ID, 1)))) COURSE_300
GROUP BY STUDENT_ID
ORDER BY STUDENT_ID
Back to top
ASE 11.5 introduced the case statement, which can be used to
replace a lot of this 'trick' SQL with more readable (and standard) code.
With a case statement, an if then else is as easy as:
declare @val char(20)
select @val = 'grand'
select case when @val = 'small' then
'petit'
else
'grand'
end
However, quite a number of people are still using pre-11.5
implementations, including those people using the free 11.0.3.3 Linux
release. In that case you can use the following recipe.
To implement the following condition in a select clause:
if @val = 'small' then
print 'petit'
else
print 'grand'
fi
in versions of ASE prior to 11.5 do the following:
select isnull(substring('petit', charindex('small', @val), 255), 'grand')
To test it out, try this:
declare @val char(20)
select @val = 'grand'
select isnull(substring('petit', charindex('small', @val), 255), 'grand')
This code is not readily understandable by most programmers, so remember
to comment it well.
Back to top
6.2.3: Removed
By example:
declare @Integer int
/* Good for positive numbers only. */
select @Integer = 1000
select "Positives Only" =
right( replicate("0", 12) + convert(varchar, @Integer), 12)
/* Good for positive and negative numbers. */
select @Integer = -1000
select "Both Signs" =
substring( "- +", (sign(@Integer) + 2), 1) +
right( replicate("0", 12) + convert(varchar, abs(@Integer)), 12)
select @Integer = 1000
select "Both Signs" =
substring( "- +", (sign(@Integer) + 2), 1) +
right( replicate("0", 12) + convert(varchar, abs(@Integer)), 12)
go
Produces the following results:
Positives Only
--------------
000000001000
Both Signs
-------------
-000000001000
Both Signs
-------------
+000000001000
Back to top
During processing, if a divide by zero error occurs you will not get the answer you
want. If you want the result set to come back and null to be displayed where divide by
zero occurs do the following:
1> select * from total_temp
2> go
field1 field2
----------- -----------
10 10
10 0
10 NULL
(3 rows affected)
1> select field1, field1/(field2*convert(int,
substring('1',1,abs(sign(field2))))) from total_temp
2> go
field1
----------- -----------
10 1
10 NULL
10 NULL
Back to top
To convert months to financial year months (i.e. July = 1, Dec = 6, Jan = 7, June = 12
)
Method #1
select ... ((sign(sign((datepart(month,GetDate())-6) * -1)+1) *
(datepart(month, GetDate())+6))
+ (sign(sign(datepart(month, GetDate())-7)+1) *
(datepart(month, GetDate())-6)))
...
from ...
Method #2
select charindex(datename(month,getdate()),
" July August September October November December
January Febuary March April May June "
) / 10
In the above example, the embedded blanks are significant.
Back to top
Alright, so you wanna know more about representing hierarchies in a relational
database? Before I get in to the nitty gritty I should at least give all of the credit for
this algorithm to: "_Hierarical_Structures:_The_Relational_Taboo!_, _(Can_
Transitive_Closure_Queries_be_Efficient?)_", by Michael J. Kamfonas as published in
1992 "Relational Journal" (I don't know which volume or issue).
The basic algorithm goes like this, given a tree (hierarchy) that looks roughly like
this (forgive the ASCII art--I hope you are using a fixed font to view this):
a
/ \
/ \
/ \
b c
/ \ /|\
/ \ / | \
/ \ / | \
d e f | g
Note, that the tree need not be balanced for this algorithm to work.
The next step assigned two numbers to each node in the tree, called left and right
numbers, such that the left and right numbers of each node contain the left and right
numbers of the ancestors of that node (I'll get into the algorithm for assigning these
left and right numbers later, but, hint: use a depth-first search):
1a16
/ \
/ \
/ \
2b7 8c15
/ \ /|\
/ \ / | \
/ \ / | \
3d4 5e6 9f10 11g12 13h14
Side Note: The careful observer will notice that these left and right numbers look an
awful lot like a B-Tree index.
So, you will notice that all of the children of node 'a' have left and right numbers
between 1 and 16, and likewise all of the children of 'c' have left and right numbers
between 8 and 15. In a slightly more relational format this table would look like:
Table: hier
node parent left_nbr right_nbr
----- ------ -------- ---------
a NULL 1 16
b a 2 7
c a 8 15
d b 3 4
e b 5 6
f c 9 10
g c 11 12
h c 13 14
So, given a node name, say @node (in Sybase variable format), and you want to know all
of the children of the node you can do:
SELECT h2.node
FROM hier h1,
hier h2
WHERE h1.node = @node
AND h2.left_nbr > h1.left_nbr
AND h2.left_nbr < h1.right_nbr
If you had a table that contained, say, the salary for each node in your hierarchy
(assuming a node is actually a individual in a company) you could then figure out the
total salary for all of the people working underneath of @node by doing:
SELECT sum(s.salary)
FROM hier h1,
hier h2,
salary s
WHERE h1.node = @node
AND h2.left_nbr > h1.left_nbr
AND h2.right_nbr > h1.right_nbr
AND s.node = h2.node
Pretty cool, eh? And, conversely, if you wanted to know how much it cost to manage
@node (i.e. the combined salary of all of the boss's of @node), you can do:
SELECT sum(s.salary)
FROM hier h1,
hier h2,
salary s
WHERE h1.node = @node
AND h2.left_nbr < h1.left_nbr
AND h2.left_nbr > h1.right_nbr
AND s.node = h2.node
Now that you can see the algorithm in action everything looks peachy, however the
sticky point is the method in which left and right numbers get assigned. And,
unfortunately, there is no easy method to do this relationally (it can be done, it just
ain't that easy). For an real- world application that I have worked on, we had an external
program used to build and maintain the hierarchies, and it was this program's
responsibility to assign the left and right numbers.
But, in brief, here is the algorithm to assign left and right numbers to every node in
a hierarchy. Note while reading this that this algorithm uses an array as a stack, however
since arrays are not available in Sybase, they are (questionably) emulated using a temp
table.
DECLARE @skip int,
@counter int,
@idx int,
@left_nbr int,
@node varchar(10)
/*-- Initialize variables --*/
SELECT @skip = 1000, /* Leave gaps in left & right numbers */
@counter = 0, /* Counter of next available left number */
@idx = 0 /* Index into array */
/*
* The following table is used to emulate an array for Sybase,
* for Oracle this wouldn't be a problem. :(
*/
CREATE TABLE #a (
idx int NOT NULL,
node varchar(10) NOT NULL,
left_nbr int NOT NULL
)
/*
* I know that I always preach about not using cursors, and there
* are ways to get around it, but in this case I am more worried
* about readability over performance.
*/
DECLARE root_cur CURSOR FOR
SELECT h.node
FROM hier h
WHERE h.parent IS NULL
FOR READ ONLY
/*
* Here we are populating our "stack" with all of the root
* nodes of the hierarchy. We are using the cursor in order
* to assign an increasing index into the "stack"...this could
* be done using an identity column and a little trickery.
*/
OPEN root_cur
FETCH root_cur INTO @node
WHILE (@@sqlstatus = 0)
BEGIN
SELECT @idx = @idx + 1
INSERT INTO #a VALUES (@idx, @node, 0)
FETCH root_cur INTO @node
END
CLOSE root_cur
DEALLOCATE CURSOR root_cur
/*
* The following cursor will be employed to retrieve all of
* the children of a given parent.
*/
DECLARE child_cur CURSOR FOR
SELECT h.node
FROM hier h
WHERE h.parent = @node
FOR READ ONLY
/*
* While our stack is not empty.
*/
WHILE (@idx > 0)
BEGIN
/*
* Look at the element on the top of the stack.
*/
SELECT @node = node,
@left_nbr = left_nbr
FROM #a
WHERE idx = @idx
/*
* If the element at the top of the stack has not been assigned
* a left number yet, then we assign it one and copy its children
* on the stack as "nodes to be looked at".
*/
IF (@left_nbr = 0)
BEGIN
/*
* Set the left number of the current node to be @counter + @skip.
* Note, we are doing a depth-first traversal, assigning left
* numbers as we go.
*/
SELECT @counter = @counter + @skip
UPDATE #a
SET left_nbr = @counter
WHERE idx = @idx
/*
* Append the children of the current node to the "stack".
*/
OPEN child_cur
FETCH child_cur INTO @node
WHILE (@@sqlstatus = 0)
BEGIN
SELECT @idx = @idx + 1
INSERT INTO #a VALUES (@idx, @node, 0)
FETCH child_cur INTO @node
END
CLOSE child_cur
END
ELSE
BEGIN
/*
* It turns out that the current node already has a left
* number assigned to it, so we just need to assign the
* right number and update the node in the actual
* hierarchy.
*/
SELECT @counter = @counter + @skip
UPDATE h
SET left_nbr = @left_nbr,
right_nbr = @counter
WHERE h.node = @node
/*
* "Pop" the current node off our "stack".
*/
DELETE #a WHERE idx = @idx
SELECT @idx = @idx - 1
END
END /* WHILE (@idx > 0) */
DEALLOCATE CURSOR child_cur
While reading through this, you should notice that assigning the left and right numbers
to the entire hierarchy is very costly, especially as the size of the hierarchy grows. If
you put the above code in an insert trigger on the hier table, the overhead for inserting
each node would be phenomenal. However, it is possible to reduce the overall cost of an
insertion into the hierarchy.
- By leaving huge gaps in the left & right numbers (using the @skip variable), you can
reduce the circumstances in which the numbers need to be reassigned for a given insert.
Thus, as long as you can squeeze a new node between an existing pair of left and right
numbers you don't need to do the re-assignment (which could affect all of the node in the
hierarchy).
- By keeping an extra flag around in the hier table to indicate which nodes are leaf nodes
(this could be maintained with a trigger as well), you avoid placing leaf nodes in the
array and thus reduce the number of updates.
Deletes on this table should never cause the left and right numbers to be re-assigned
(you could even have a trigger automagically re-parent orphaned hierarchy nodes).
All-in-all, this algorithm is very effective as long as the structure of the hierarchy
does not change very often, and even then, as you can see, there are ways of getting
around a lot of its inefficiencies.
Back to top
11.5 and above
The Adaptive Server (11.5) will allow O/S calls from within stored procedures and
triggers. These stored procedures are known as extended stored procedures.
Pre-11.5
Periodically folks ask if it's possible to make a system command or call a UNIX process
from a Trigger or a Stored Procedure.
Guaranteed Message Processing
The typical ways people have implemented this capability is:
- Buy Open Server and bind in your own custom stuff (calls to system() or custom C code)
and make Sybase RPC calls to it.
- Have a dedicated client application running on the server box which regularly scans a
table and executes the commands written into it (and tucks the results into another table
which can have a trigger on it to gather results...). It is somewhat tricky but cheaper
than option 1.
Sybase ASE 10.0.2.5 and Above - syb_sendmsg()
This release includes a new built-in function called syb_sendmsg(). Using this function
you can send a message up to 255 bytes in size to another application from the ASE.
The arguments that need to be passed to syb_sendmsg() are the IP address and port number
on the destination host, and the message to be sent. The port number specified can be any
UDP port, excluding ports 1-1024, not already in use by another process. An example is:
1> select syb_sendmsg("120.10.20.5", 3456, "Hello")
2> go
This will send the message "Hello" to port 3456 at IP address '120.10.20.5'.
Because this built-in uses the UDP protocol to send the message, the ASE does not
guarantee the receipt of the message by the receiving application.
Also, please note that there are no security checks with this new function. It
is possible to send sensitive information with this command and Sybase strongly recommends
caution when utilizing syb_sendmsg to send sensitive information across the network. By
enabling this functionality, the user accepts any security problems which result from its
use (or abuse).
To enable this feature you should run the following commands as the System Security
Officer.
- Login to the ASE using 'isql'.
- Enable the syb_sendmsg() feature using sp_configure.
1> sp_configure "allow sendmsg", 1
2> go
1> sp_configure "syb_sendmsg port number", <port number>
2> go
1> reconfigure with override -- Not necessary with 11.0 and above
2> go
The server must be restarted to set the port number.
Using syb_sendmsg() with Existing Scripts
Since syb_sendmsg() installs configuration parameter "allow sybsendmsg",
existing scripts that contain the syntax
1> sp_configure allow, 1
2> go
to enable updates to system tables should be altered to be fully qualified as in the
following:
1> sp_configure "allow updates", 1
2> go
If existing scripts are not altered they will fail with the following message:
1> sp_configure allow, 1
2> go
Configuration option is not unique.
duplicate_options
----------------------------
allow updates
allow sendmsg
(return status = 1)
(The above error is a little out of date for the latest releases of ASE, there are now
8 rows that contain "allow", but the result is the same.)
Backing Out syb_sendmsg()
The syb_sendmsg() function requires the addition on two config values. If it becomes
necessary to roll back to a previous ASE version which does not include
syb_sendmsg(), please follow the instructions below.
- Edit the RUNSERVER file to point to the SWR ASE binary you wish to use.
- isql -Usa -P<sa password> -Sserver_name -n -iunconfig.sendmsg -ooutput_file
Sample C program
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <fcntl.h>
main(argc, argv)
int argc; char *argv[];
{
struct sockaddr_in sadr;
int portnum,sck,dummy,msglen;
char msg[256];
if (argc <2) {
printf("Usage: udpmon <udp portnum>\n");
exit(1);
}
if ((portnum=atoi(argv[1])) <1) {
printf("Invalid udp portnum\n");
exit(1);
}
if ((sck="socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP))" < 0) {
printf("Couldn't create socket\n");
exit(1);
}
sadr.sin_family = AF_INET;
sadr.sin_addr.s_addr = inet_addr("0.0.0.0");
sadr.sin_port = portnum;
if (bind(sck,&sadr,sizeof(sadr)) < 0) {
printf("Couldn't bind requested udp port\n");
exit(1);
}
for (;;)
{
if((msglen="recvfrom(sck, msg, sizeof(msg), 0, NULL, &dummy))" < 0)
printf("Couldn't recvfrom() from udp port\n");
printf("%.*s\n", msglen, msg);
}
}
Back to top
This has several sections, culled from various sources. It is better described as
"Everything you've ever wanted to know about identities." It will serve to
answer the following frequently asked questions:
What are the Features and Advantages of using Identities?
What are the Problems with and Disadvantages of Identities?
Common Questions about Identities
How do I optimize the performance of a table that
uses Identities?
How do I recover from a huge gap in my identity
column?
How do I fix a table that has filled up its identity
values?
OK, I hate identities. How do I generate sequential
keys without using the Identity feature?
How do I optimize a hand-made sequential key system
for best performance?
- Question 8.1 of the comp.database.sybase
FAQ has a quick blurb about identities and sequential numbers. Search down in the page
for the section titled, "Generating Sequential Numbers." Question 8.1 is a
general document describing Performance and Tuning topics to be considered and thus
doesn't go into as much detail as this page.
- There's a white paper by Malcolm Colton available from the sybase web site. Goto the
Sybase web site http://www.sybase.com and type Surrogate
in the search form. Select the Surrogate Primary Keys, Concurrency, and the Cache Hit
Ratio document.
Advantages/Features of Using Identities
There's an entire section devoted to Identity columns in the ASE Reference
manual, Chapter 5
Sybase System 10 introduced many changes over the 4.9.x architecture. One of these
changes was the Identity feature. The identity column is a special column type that gets
automatically updated by the server upon a new row insert. Its purpose is to guarantee a
unique row identifier not based on the other data in the row. It was integrated with the
server and made memory based for fast value retrieval and no locking (as was/is the case
with homegrown sequential key generation schemes).
The Advantages and Features of Identities include:
- A non-SQL based solution to the problem of having an default unique value assigned to a
row. ASE prefetches identity values into cache and adds them automatically to rows
as they're inserted into tables that have a type Identity column. There's no concurrency
issues, no deadlocking in high-insert situations, and no possibility of duplicate values.
- A high performance Unique identifier; ASE's optimizer is tuned to work well with
Unique indexes based on the identity value.
- The flexibility to insert into the identity field a specific value in the case of a
mistaken row deletion. (You can never update however). You accomplish this by:
1> set identity_insert [datababase]..[table] on
2> go
Note however that the System will not verify the uniqueness of the value you
specifically insert (unless of course you have a unique index existing on the identity
column).
- The flexibility during bcp to either retain existing identity values or to reset them
upon bcping back in. To retain the specific identity values during a bcp out/in process,
bcp your data out normally (no special options). Then create your bcp in target table with
ddl specifying the identity column in the correct location. Upon bcp'ing back in, add the
"-E" option at the end of the bcp line, like this (from O/S prompt):
% bcp [database]..[new_table] in [bcp datafile] -Usa -S[server] -f [fmt file] -E
For procedures on resetting identity values during a bcp, see the section regarding Identity gaps.
Databasewide Identity options: 1) The ability to set Sybase to automatically create an
Identity column on any table that isn't created with a primary key or a unique constraint
specified. 2) Sybase can automatically include an Identity field in all indexes created,
guaranteeing all will be unique. These two options guarantee increased index performance
optimization and guarantees the use of updateable cursors and isolation level 0 reads.
These features are set via sp_dboption, like this:
1> sp_dboption [dbname], "auto identity", true
2> go
or
1> sp_dboption [dbname], "identity in nonunique index", true
2> go
To tune the size of the auto identity (it defaults to precision 10):
1> sp_configure "size of auto identity", [desired_precision]
2> go
(the identity in nonunique index db_option and the size of auto identity sp_configure
value are new with System 11: the auto identity existed with the original Identity feature
introduction in System 10)
Like other dboptions, you can set these features on the model database before creating new
databases and all your future databases will be configured. Be warned of the pitfalls of
large identity gaps however; see the question regarding Burn
Factor Vulnerability in the Common Questions about
Identities section.
The existence of the @@identity global variable, which keeps track of the identity value
assigned during the last insert executed by the server. This variable can be used
programming SQL around tables that have identity values (in case you need to know what the
last value inserted was). If the last value inserted in the server was to a non-identity
table, this value will be "0."
Back to start of 6.2.9
Disadvantages/Drawbacks of Using Identities
Despite its efficacy of use, the Identity has some drawbacks:
Identity
Optimization section in more detail, is that clustering on another field doesn't truly
resolve the concurrency issues. The hot spot simply moves from the last data page to the
last non-clustered index page of the index created on the Identity column.
If you fill up your identity values, no more inserts can occur. This can be a big
problem, especially if you have a large number of inserts and you have continually crashed
your server. However this problem most often occurs when you try to alter a table and add
an Identity column that's too small, or if you try to bcp into a table with an identity
column thetas too small. If this occurs, follow the procedures for recovering from identity gaps.
I've heard (but not been able to reproduce) that identities jump significantly when
dumping and loading databases. Not confirmed.
NOTE: there are several other System 11 bugs related to Identities. EBF 7312 fixes
BugId 97748, which caused duplicate identity values to be inserted at times. EBF 6886
fixed (in addition to the above described bug) an odd bug (#82460) which caused a server
crash when bcping into a table w/ an identity added via alter table. As always, try to
stay current on EBFs.
Back to start of 6.2.9
Common questions about Identities
Is the Identity the equivalent of Oracle's auto-sequencing?:
Answer: More or less yes. Oracle's auto-sequencing feature is somewhat transparent to
the end user and automatically increments if created as a primary key upon a row insert.
The Sybase Identity column is normally specified at table creation and thus is a
functional column of the table. If however you set the "auto identity" feature
for a database, the tables created will have a "hidden" identity column that
doesn't even appear when you execute a select * from [table]. See the Advantages of Identities for more details.
- How do I configure Identities?: You can either create your table initially with
the identity column:
1> create table ident_test
2> (text_field varchar(10),
3> ident_field numeric(5,0) identity)
4> go
Or alter an existing table and add an identity column:
1> alter table existing_table
2> add new_identity_field numeric(7,0) identity
3> go
When you alter a table and add an identity column, the System locks
the table while systematically incrementing and adding unique values to
each row. IF YOU DON'T SPECIFY a precision, Sybase defaults the size to
18! Thats 1,000,000,000,000,000,000-1 possible values and some major
major problems if you ever crash your ASE and burn a default number of
values... (10^18 with the default burn factor will burn 5^14 or
500,000,000,000,000 values...yikes).
- How do I Configure the burn factor?: The number of identity values that gets
"burned" upon a crash or a shutdown can by found by logging into the server and
typing:
1> sp_configure "identity burning set factor"
2> go
the Default value set upon install is 5000. The number
"5000" in this case is read as ".05% of all the
potential identity values you can have in this particular case will be
burned upon an unexpected shutdown." The actual number depends on
the size of the identity field as you specified it when you created
your table. To set the burn factor, type:
1> sp_configure "identity burning set factor", [new value]
2> go
This is a static change; the server must be rebooted before it takes
effect.
- How do I tell which tables have identities?: You can tell if a table has
identities one of two ways:
- sp_help [tablename]: there is a field included in the sp_help output describing a table
called "Identity." It is set to 1 for identity fields, 0 otherwise.
- Within a database, execute this query:
1> select object_name(id) "table",name "column", prec "precision"
2> from syscolumns
3> where convert(bit, (status & 0x80)) = 1
4> go
this will list all the tables and the field within the table that serves as an
identity, and the size of the identity field.
- What is my identity burn factor vulnerability right now?:
In other words, what would happen to my tables if I crashed my server right now?
Identities are created type numeric, scale 0, and precision X. A precision of 9 means the
largest identity value the server will be able to process is 10^9-1, or 1,000,000,000-1,
or 999,999,999. However, when it comes to Burning identities, the server will burn (based
on the default value of 5000) .05% of 1,000,000,000 or 500,000 values in the case of a
crash. (You may think an identity precision allowing for 1 Billion rows is optimistic, but
I once saw a precision set at 14...then the database crashed and their identity values
jumped 5 TRILLION. Needless to say they abandoned their original design. Even worse, SQL
server defaults precision to 18 if you don't specify it upon table creation...that's a
MINIMUM 10,000,000,000 jump in identity values upon a crash with the absolute minimum burn
factor)
Lets say you have inserted 5 rows into a table, and then you crash your server and then
insert 3 more rows. If you select all the values of your identity field, it will look like
this:
1> select identity_field from id_test
2> go
identity_field
--------------
1
2
3
4
5
500006
500007
500008
(8 rows affected)
Here's your Identity burning options (based on a precision of 10^9 as above):
Burn value % of values # values burned during crash
5000 .05% 500,000
1000 .01% 100,000
100 .001% 10,000
10 .0001% 1,000
1 .00001% 100
So, the absolute lowest amount of numbers you'll burn, assuming you configure the burn
factor down to 1 (sp_configure "identity burning set factor", 1) and a precision
of 9, is 100 values.
Back to start of 6.2.9
Optimizing your Identity setup for
performance and maintenance
If you've chosen to use Identities in your database, here are some configuration tips to
avoid typical Identity pitfalls:
- Tune the burn factor!: see the
vulnerability
section for a discussion on what happens to identity values upon ASE crashes. Large
jumps in values can crash front ends that aren't equipped to handle and process numbers
upwards of 10 Trillion. I've seen Powerbuilder applications crash and/or not function
properly when trying to display these large identity values.
- Run update statistics often on tables w/ identities: Any index with an identity
value as the first column in the search condition will have its performance severely
hampered if Update statistics is not run frequently. Running a nightly update
statistics/sp_recompile job is a standard DBA task, and should be run often regardless of
the existence of identities in your tables.
- Tune the "Identity Grab Size": ASE defaults the number of
Identity values it pre-fetches to one (1). This means that in high insert environments the
Server must constantly update its internal identity placeholder structure before adding
the row. By tuning this parameter up:
1> sp_configure "identity grab size", [number]
2> go
You can prefetch larger numbers of values for each user as they log into the server an
insert rows. The downside of this is, if the user doesn't use all of the prefetched block
of identity values, the unused values are lost (seeing as, if another user logs in the
next block gets assigned to him/her). This can quickly accelerate the depletion of
identity values and can cause gaps in Identity values.
(this feature is new with System 11)
- Do NOT build business rules around Identity values. More generally speaking the
recommendation made by DBAs is, if your end users are EVER going to see the identity field
during the course of doing their job, then DON'T use it. If your only use of the Identity
field is for its advertised purpose (that being solely to have a uniquely identifying row
for a table to index on) then you should be fine.
- Do NOT build your clustered index on your Identity field, especially if you're
doing lots of inserts. This will create a hot spot of contention at the point of
insertion, and in heavier OLTP environments can be debilitating.
- There is an excellent discussion in document
http://www.sybase.com/detail?id=860
on the performance and tuning aspects of Identities. It
supplements some of the information located here (Note: this will open
in a new browser window).
Back to start of 6.2.9
Recovery from Large Identity value gaps or
Recovery from Identity insert errors/Full Identity tables
This section will discuss how to re-order the identity values for a table following a
crash/abnormal shutdown that has resulted in huge gaps in the values. The same procedure
is used in cases where the identity field has "filled up" and does not allow
inserts anymore. Some applications that use Identities are not truly candidates for this
process (i.e., applications that depend on the identity field for business purposes as
opposed to simple unique row identifiers). Applications like this that wish to rid their
dependence on identities will have to re-evaluate their database design.
- Method 1:bcp out and in:
- First, (from O/S command line):
% bcp database..table out [data_file] -Usa -S[server] -N
This will create a binary bcp datafile and will force the user to create a .fmt file.
The -N option tells the server to skip the identity field while bcp'ing out.
- drop and recreate the table in question from ddl (make sure your table ddl specifies the
identity field).
- Now bcp back in:
% bcp database.table in [data_file -Usa -S[server] -f[fmt file] -N
The -N option during bcp in tells the server to ignore the data file's placeholder
column for the defined identity column.
Coincidentally, if you bcp out w/o the -N option, drop the table, recreate from ddl
specifying the identity field, and bcp back in w/o the -N option, the same effect as above
occurs.
(note: if you bcp out a table w/ identity values and then want to preserve the identity
values during the bcp back in, use the "-E" option.)
- Method 2: select into a new table, adding the identity column as you go : Follow
this process:
1> select [all columns except identity column]
2> [identity column name ] = identity(desired_precision)
3> into [new_table]
4> from [old table]
5> go
- There are alternate methods that perform the above in multi steps, and might be more
appropriate in some situations.
- You can bcp out all the fields of a table except the identity column (create the bcp
format file from the original table, edit out the identity column, and re-bcp). At this
point you can create a new table with or without the identity column; if you create it
with, as you bcp back in the Server will assign new identity values. If you create it
without, you can bcp back in normally and then alter the table and add the identity later.
- You can select all columns but the identity into a new table, then alter that table and
add an identity later on.
Back to start of 6.2.9
How do I generate Sequential Keys w/o the
Identity feature?
There are many reasons not to use the Identity feature of Sybase. This section will
present several alternative methods, along with their advantages and drawbacks. The
methods are presented in increasing order of complexity. The most often implemented is
Method 3, which is a more robust version of Method 2 and which uses a surrogate-key
storage table.
Throughout this section the test table I'm adding lines to and generating sequential
numbers for is table inserttest, created like this:
1> create table inserttest
2> (testtext varchar(25), counter int)
3> go
- Method 1: Create your table with a column called counter of type int. Then, each time
you insert a row, do something like this:
1> begin tran
2> declare @nextkey int
3> select @nextkey=max(counter)+1 from inserttest holdlock
4> insert inserttest (testtext,counter) values ("test_text,@nextkey")
5> go
1> commit tran
2> go
This method is rather inefficient, as large tables will take minutes to return a
max(column) value, plus the entire table must be locked for each insert (since the max()
will perform a table scan). Further, the select statement does not guarantee an exclusive
lock when it executes unless you have the "holdlock" option; so either duplicate
values might be inserted to your target table or you have massive deadlocking.
- Method 2: See Question 10.1.1 of the comp.database.sybase FAQ is the May 1994 (Volume
3, Number 2) Sybase Technical Note (these links will open in a new browser window). Search
down in the tech note for the article titled, "How to Generate Sequential Keys for
Table Key Columns." This has a simplistic solution that is expanded upon in Method 3.
- Method 3: Create a holding table for keys in a common database: Here's our central
holding table.
1> create table keystorage
2> (tablename varchar(25),
4> lastkey int)
5> go
And initially populate it with the tablenames and last values inserted (enter in a 0
for tables that are brand new).
1> insert into keystorage (tablename,lastkey)
2> select "inserttest", max(counter) from inserttest
3> go
Now, whenever you go to insert into your table, go through a process like this:
1> begin tran
2> update keystorage set lastkey=lastkey+1 where tablename="inserttest"
3> go
1> declare @lastkey int
2> select @lastkey = lastkey from keystorage where tablename="inserttest"
3> insert inserttest (testtext,counter) values ("nextline",@lastkey)
4> go
1> commit tran
2> go
There is plenty of room for error checking with this process: for example (code adapted
from Colm O'Reilly (colm@mail.lk.blackbird.ie)
post to Sybase-L 6/20/97):
1> begin tran
2> update keystorage set lastkey=lastkey+1 where tablename="inserttest"
3> if @@rowcount=1
4> begin
5> declare @lastkey int
6> select @lastkey=lastkey from keystorage where tablename="inserttest"
7> end
8> commit tran
9> begin tran
10> if @lastkey is not null
11> begin
12> insert inserttest (testtext,counter) values ("third line",@lastkey)
13> end
14> commit tran
15> go
This provides a pretty failsafe method of guaranteeing the success of the select
statements involved in the process. You still have a couple of implementation decisions
though:
- One transaction or Two? The above example uses two transactions to complete the task;
one to update the keystorage and one to insert the new data. Using two transactions
reduces the amount of time the lock is held on keystorage and thus is better for high
insertion applications. However, the two transaction method opens up the possibility that
the first transaction will commit and the second will roll back, leaving a gap in the
sequential numbers. (of course, this gap is small potatoes compared to the gaps that occur
in Identity values). Using one transaction (deleting lines 8 and 9 in the SQL above) will
guarantee absolutely no gaps in the values, but will lock the keystorage table longer,
reducing concurrency in high insert applications.
- Update first or select first? The examples given generally update the keystorage table
first, THEN select the new value. Performing the select first (you will have to rework the
creation scheme slightly; by selecting first you're actually getting the NEXT key to add,
where as by updating first, the keystorage table actually holds the LAST key added) you
allow the application to continue processing while it waits for the update lock on the
table. However, performing the update first guarantees uniqueness (selects are not
exclusive).
Some DBAs experienced with this keystorage table method warn of large amounts of blocking
in high insert activity situations, a potential drawback.
- Method 4: Enhance the above method by creating an insert trigger on your inserttest
table that performs the next-key obtainment logic. Or you could create an insert trigger
on keystorage which updates the table and obtains your value for you. Integrating the
trigger logic to your application might make this approach more complex. Also, because of
the nature of the trigger you'll have to define the sequence number columns as allowing
NULL values (a bad thing if you're depending on the sequential number as your primary
key). Plus, triggers will slow the operation down because after obtaining the new value
via trigger, you'll have to issue an extra update command to insert the rest of your table
values.
- Method 5: (Thanks to John Drevicky (jdrevicky@tca-techsys.com))
The following procedure is offered as another example of updating and returning the Next
Sequential Key, with an option that allows automatic reuse of numbers...... -----------------------------------------------------------------
----
--
DECLARE @sql_err int, @sql_count int
--
begin tran
--
select @out_seq = 0
--
UPDATE NEXT_SEQUENCE
SET next_seq_id
= ( next_seq_id
* ( sign(1 + sign(max_seq_id - next_seq_id) ) -- evaluates: 0 [when
-- next > max]; else 1
* sign(max_seq_id - next_seq_id) -- evaluates: 0 [when next = max];
-- 1 [next < max];
-- -1 [next > max]
) -- both evaluate to 1 when next < max
) + 1 -- increment by [or restart at] 1
WHERE seq_type = @in_seq_type
--
select @sql_err = @@error, @sql_count = @@rowcount
--
IF @sql_err = 0 and @sql_count = 1
BEGIN
select @out_seq = next_seq_id
from NEXT_SEQUENCE
where seq_type = @in_seq_type
--
commit tran
return 0
END
ELSE
BEGIN
RAISERROR 44999 'Error %1! returned from proc derive_next_sequence...no update occurred', @sql_err
rollback tran
END
- Other Methods: there are several other implementation alternatives
available that involve more complex logic but which might be good
solutions. One example has a central table that stores pre-inserted
sequential numbers that are deleted as they're inserted into the
production rows. This method allows the sequence numbers to be
recycled if their associated row is deleted from the production
table. An interesting solution was posted to Sybase-L 6/20/97 by
Matt Townsend (
mtowns@concentric.net) and
is based on the millisecond field of the date/time stamp. His
solution guarantees uniqueness without any surrogate tables or
extra inserts/updates, and is a superior performing solution to
other methods described here (including Identities), but cannot
support exact sequential numbers. Some other solutions are covered
in a white paper available at Sybase's Technical library discussing
Sequential Keys (this will open in a new browser
window).
Back to start of 6.2.9
Optimizing your home grown Sequential key
generating process for any version of Sybase
- max_rows_per_page/fillfactor/table padding to simulate row level locking: This is
the most important tuning mechanism when creating a hand -made sequence key generation
scheme. Because of Sybase's page level locking mechanism, your concurrency performance in
higher-insert activity situations could be destroyed unless the server only grabs one row
at a time. However since Sybase doesn't currently have row-level locking, we simulate
row-level locking by creating our tables in such a way as to guarantee one row per 2048
byte page.
- For pre-System 11 servers; Calculate the size of your rows, then create dummy fields in
the table that get populated with junk but which guarantee the size of the row will fill
an entire page. For example (code borrowed from Gary Meyer's 5/8/94 ISUG presentation (
gmeyer@netcom.com)):
1> create table keystorage
2> (tablename varchar(25),
3> lastkey int,
4> filler1 char(255) not null,
5> filler2 char(255) not null,
6> filler3 char(255) not null,
7> filler4 char(255) not null,
8> filler5 char(255) not null,
9> filler6 char(255) not null,
9> filler7 char(255) not null)
10> with fillfactor = 100
11> go
We use 7 char(255) fields to pad our small table. We also specify the fillfactor create
table option to be 100. A fillfactor of 100 tells the server to completely fill every data
page. Now, during your initial insertion of a line of data, do this:
1> insert into keystorage
2> (tablename,lastkey,
3> filler1,filler2,filler3,filler4,filler5,filler6,filler7)
4> values
5> ("yourtable",0,
6> replicate("x",250),replicate("x",250),
7> replicate("x",250),replicate("x",250),
8> replicate("x",250),replicate("x",250),
9> replicate("x",250))
10> go
This pads the row with 1750 bytes of junk, almost guaranteeing that, given a row's byte
size limit of 1962 bytes (a row cannot span more than one page, thus the 2048 page size
minus server overhead == 1962), we will be able to simulate row level locking.
In Sybase 11, a new create table option was introduced: max_rows_per_page. It automates
the manual procedures above and guarantees at a system level what we need to achieve; one
row per page.
1> create table keystorage
2> (tablename varchar(25),
3> lastkey int)
4> with max_rows_per_page = 1
5> go
Create unique clustered indexes on the tablename/entity name within your
keystorage table. This can only improve its performance. Remember to set max_rows_per_page
or the fillfactor on your clustered index, as clustered indexes physically reorder the
data.
Break up the process into multiple transactions wherever possible; this will
reduce the amount of time any table lock is held and will increase concurrency in high
insertion environments.
Use Stored Procedures: Put the SQL commands that update the keystorage table and
then insert the updated key value into a stored procedure. Stored procedures are generally
faster than individual SQL statements in your code because procedures are pre-compiled and
have optimization plans for index usage stored in Sybase's system tables.
Enhance the keystorage table to contain a fully qualified table name as opposed
to just the tablename. This can be done by adding fields to the table definition or by
just expanding the entity name varchar field definition. Then place the keystorage table
in a central location/common database that applications share. This will eliminate
multiple keystorage tables but might add length to queries (since you have to do
cross-database queries to obtain the next key). - There is an excellent discussion
located in the whitepapers section of Sybase's home page discussing the performance and
tuning aspects of any type of Sequential key use. It supplements the information here
(note: this page will open in a new browser window).
Back to start of 6.2.9
Back to top
Adaptive Server Enterprise: System 12
ASE 12 supports dynamic SQL, allowing the following:
declare @sqlstring varchar(255)
select @sqlstring = "select count(*) from master..sysobjects"
exec (@sqlstring)
go
Adaptive Server Enterprise: 11.5 and 11.9
There is a neat trick that was reported first by Bret Halford (
bret@sybase.com
). (If anyone knows
better, point me to the proof and I will change this!) It utilises
the CIS features of Sybase ASE.
- Firstly define your local server to be a remote server using
sp_addserver LOCALSRV,sql_server[,INTERFACENAME]
go
- Enable CIS
sp_configure "enable cis",1
go
- Finally, use sp_remotesql, sending the sql to the server defined in point 1.
declare @sqlstring varchar(255)
select @sqlstring = "select count(*) from master..sysobjects"
sp_remotesql LOCALSRV,@sqlstring
go
Remember to ensure that all of the databases referred to in the SQL string are fully
qualified since the call to sp_remotesql places you back in your default database.
Sybase ASE (4.9.x, 10.x and 11.x before 11.5)
Before System 11.5 there was no real way to execute dynamic SQL. Rob Verschoor
has some very neat ideas that fills some of the gaps (http://www.euronet.nl/~syp_rob/dynsql.html).
Dynamic Stored Procedure Execution
With System 10, Sybase introduced the ability to execute a stored
procedure dynamically.
declare @sqlstring varchar(255)
select @sqlstring = "sp_who"
exec @sqlstring
go
For some reason Sybase chose never to document this feature.
Obviously all of this is talking about executing dynamic SQL within the
server itself ie stored procedures and triggers. Dynamic SQL within
client apps is a different matter altogether.
Back to top
It was possible to concatenate a
series of strings to return a single column, in a sort of analogous manner
to sum summing all of the numbers in a column.
However, this was not a 'feature' but a bug, so if you are running an EBF
that has the fix for CR 210688, this is no longer possible.
(Obsolete:) Use a case statement, a la,
1> declare @string_var varchar(255)
2>
3> select @string_var = ""
4>
5> select @string_var = @string_var +
6> (case 1 when 1
7> then char_col
8> end)
9> from tbl_a
10>
11> print "%1!", @string_var
12> go
(1 row affected)
ABCDEFGH
(8 rows affected)
1> select * from tbl_a
2> go
char_col
--------
A
B
C
D
E
F
G
H
(8 rows affected)
1>
Back to top
Sybase does not have a direct equivalent to Oracle's rownum but its
functionality can be emulated in a lot of cases.
If you are simply trying to retrieve the first N rows of a table, then simple
use:
set rowcount
replacing <N> with your desired number of rows. (set
rowcount 0 restores normality.) If it is simply the last N rows, then use
a descending order-by clause in the select.
1> set rowcount
2> go
1> select foo
2> from bar
3> order by barID desc
4> go
If you are trying to retrieve rows 100 to 150, say, from a table in a
given order. You could use this to retrieve rows for a
set of web pages, but there are probably more efficient ways using cursors
or well written queries or even Sybperl! The general idea is select
the rows into a temporary table adding an identity column at the same
time. Only select enough rows to do the job using the rowcount
trick. Finally, return the rows from the temporary table where the
identity column is between 100 and 150. Something like this:
set rowcount 150
select pseudo_key = identity(3),
col1,
col2
into #tempA
from masterTable
where clause...
order by 2,3
select col1,col2 from #tempA where pseudo_key between 100 and 150
Remember to reset rowcount back to 0 before issuing any more SQL or you
will only get back 150 rows!
A small optimisation would be to select only the key columns for the
source table together with the identity key. Once you have the set of rows
you require in the temporary table, join this back to the source using the
key columns to get any data that you require.
An alternative, which might be better if you needed to join back to this
table a lot, would be to insert enough rows to cover the range as before,
but then delete the set of unwanted rows. This would be a very efficient
mechanism if the majority of your queries involved the first few rows of a
table. A typical application for this might be a search engine displaying
relevant items first. The chances are that the user is going to be bored
after the first couple of pages and go back to playing 'Internet Doom'.
set rowcount 150
select col1,
col2
into #tempA
from masterTable
where clause...
set rowcount 100
delete #tempA
Sybase does not guarantee to return rows in any particular order, so the
delete may not delete the correct set of rows. In the above
example, you should add an order-by to the 'select' and build a clustered
index on a suitable key in the temporary table.
The following stored proc was posted to the Sybase-L mailing list and
uses yet another mechanism. You should check that it works as expected in
your environment since it relies on the fact a variable will be set using
the last row that is returned from a result set. This is not published
behaviour and is not guaranteed by Sybase.
CREATE PROCEDURE dbo.sp_get_posts
@perpage INT,
@pagenumber INT
WITH RECOMPILE
AS
-- if we're on the first page no need to go through the @postid push
IF @pagenumber = 1
BEGIN
SET ROWCOUNT @perpage
SELECT ...
RETURN
END
-- otherwise
DECLARE @min_postid NUMERIC( 8, 0 ),
@position INT
SELECT @position = @perpage * ( @pagenumber - 1 ) + 1
SET ROWCOUNT @position
-- What happens here is it will select through the rows
-- and order the whole set.
-- It will stop push postid into @min_postid until it hits
-- ROWCOUNT and does this out of the ordered set (a work
-- table).
SELECT @min_postid = postid
FROM post
WHERE ...
ORDER BY postid ASC
SET ROWCOUNT @perpage
-- we know where we want to go (say the 28th post in a set of 50).
SELECT ...
FROM post
WHERE postid >= @min_postid
...
ORDER BY postid ASC
Yet another solution would be to use a loop and a counter. Probably the
least elegant, but again, it would depend on what you were trying to do as
to what would be most appropriate.
As you can see, none of these are particularly pretty. If you know of a
better method, please forward it to dowen@midsomer.org.
Back to top
This question is certainly not rocket science, but it is often nice to
know how many rows are returned as part of a group by. This might be for a
report or a web query, where you would want to tell the user how many rows
were returned on page one. It is easy using a temp table, but how to do it
without a temp table is a little harder. I liked this solution and
thought that it might not be obvious to everyone, it was certainly
educational to me. Thanks go to Karl Jost for a very nice answer.
So, give data like:
name item
---- ----
Brown 1
Smith 2
Brown 5
Jones 7
you wish to return a result set of the form:
name sum(item) rows
---- --------- ----
Brown 6 3
Jones 7 3
Smith 2 3
rather than
name sum(item) rows
---- --------- ----
Brown 6 2
Jones 7 1
Smith 2 1
Use the following, beguilingly simple query:
select name, sum(item), sum(sign(count(*)))
from data
group by name
Back to top
|