Bring the suspect pages or database on line. You can print a list of
pages or databases that are suspect after recovery using the
sp_listsuspect_db and sp_listsuspect_page stored procedures:
sp_listsuspect_db
sp_listsuspect_page [dbname]
You can bring these pages or database online using the
sp_forceonline_db or sp_forceonline_page stored procedures:
sp_forceonline_db dbname
{"sa_on" | "sa_off" | "all_users"}
sp_forceonline_page dbname, pagenumber
{"sa_on" | "sa_off" | "all_users"}
sa_on and sa_off toggle the database or page online and offline, and
allow access to the database or page only to those with the sa_role set
on. This permits the DBA to examine and repair the suspect database or
pages without other users being able to access them.
WARNING! The all_users option is irreversible and makes the database or
page available to all users. If no repairs have been made, this
may result in some level of logical inconsistency.
SUICIDING THE LOG
The new RFI feature in 11.5 eliminates most of the need for suiciding the
log. The two most common reasons for suiciding the log in the past were:
- No backups are available or the backups are too old.
- Insufficient time to restore.
There should never, of course, be a situation where backups are unavailable
or are too old. Unfortunately that is too often the case, either because the
dumps are bad or due to poor planning. In such situations, suiciding the log
may be the only recourse. Aside from those situations however, you should
never consider suiciding a viable option.
Note: DBA's should test all backup and restore procedures before relying
on them. If you attempt to load a dump on the original database and
it completes only partially, you will have eliminated the
possibility of using that database again and may even have
eliminated the final chance to recover data by suiciding the log.
RFI Example
Here is an example of recovery using RFI's page level features:
During recovery of a database, five pages were marked suspect. The DBA
examined the pages and determined that three of them are index pages on a
single allpages--locked (APL) table, and that the other two marked suspect
are data pages belonging to different tables. The database has been marked
as read_only and while users can query the database, no changes can
currently take place.
First the DBA onlines the pages with the sa_on option. The DBA then
immediately dumps the transaction log to ensure the ability to recover to
this point should something else go wrong. Recovery would involve loading a
database dump and all subsequent transaction dumps.
Before RFI, a dump of a suspect database was not possible. With RFI, the DBA
can make a dump of the slightly corrupt database in case it is needed later.
Often a recent dump with a few problems is preferable to an older dump with
no problems. This is purely a safety measure as the DBA hopes to be able to
repair the database, which is currently partly unrecovered.
Next the DBA runs dbcc indexalloc on the index containing the three offline
pages. indexalloc reveals errors, and it is decided that the best thing to
do is to rebuild the index. If the index was a nonclustered index, or a
data-only-locked (DOL) clustered (placement) index, it could simply be
dropped and recreated. However, this is an APL clustered index and any time
the clustered index is suspect, the table is suspect as well. The DBA runs
dbcc checktable to examine the integrity of the data pages. dbcc checktable
always checks the data page linkage before checking the index structures.
(Keep in mind that a DOL table's data and non-leaf index pages do not have
maintained sibling links that can be followed by dbcc pglinkage type of
checks.) By looking at the output of dbcc checktable, the DBA determines
that the data page linkage is intact. This means that it is safe to drop the
clustered index.
Note: If the data page linkage also showed corruption, the DBA would have
to resort to backups or find another way to restore the table (an
offline bcp copy, for example).
Looking at the data pages for the other two objects, it is found that the
first object is a static reference table, and an offline copy of this
table's data exists. The DBA decides to truncate the table and bcp in a new
copy. For the second object, an APL clustered table, the data page linkage
is found to be broken, but the clustered index is still intact. With this
information the DBA is able to locate all of the rows, bcp them out,
truncate the table and bcp them back in.
Once all of these tasks are complete, the question of possible incomplete
logical changes to the tables due to incomplete transactions still remains.
The only way to test for data integrity is to use user-written queries and
reports that expose inconsistencies. After doing this, the DBA can determine
if those inconsistencies can be tolerated, or repaired, or if backups are
the best option.
The final step is to detect and fix any allocation inconsistencies that may
exist due to recovery having only partially completed. The DBA can run dbcc
checkalloc to check the entire database, or dbcc tablealloc and dbcc
indexalloc can be run on the suspect objects.
>From this example it is clear that Recovery Fault Isolation makes many more
choices available to the DBA. With database-only granularity, the DBA has no
way to examine the extent of the corruption and make a decision as to what
the best solution to the failed recovery might be.
Useful dbcc Commands
checker (dbcc) commands useful for diagnosing and troubleshooting Adaptive
Server problems.
Note: This information also appears in the 11.9.2 Error Messages and
Troubleshooting Guide. Unless otherwise specified, all references in
this TechNote are to sections of that document.
STANDARD dbcc COMMANDS
The standard, supported dbcc commands used in this document are as follows:
| dbcc | CommandPurpose
|
|
tablealloc | checks allocation information for the specified table.
|
|
textalloc | checks allocation information in text pages for the specified
table.
|
|
indexalloc | checks allocation information for the specified index.
|
|
checkalloc | runs the same checks as tablealloc, for all pages in a
database.
|
|
checktable | checks the integrity of data and index pages in the specified
table.
|
|
checkdb | runs the same checks as checktable, for all tables in a
database.
|
|
checkstorage | combines some of the checks of the above commands, and
provides additional checks. checks the integrity of indexes on user tables. prints a
|
|
reindex | message when it finds the first index error and then
drops/recreates the index.
|
For a complete description of these commands, see the System Administration
Guide.
OTHER dbcc COMMANDS
This document utilizes a number of additional dbcc commands. These commands,
listed below, are useful in specific troubleshooting situations to gather
information and diagnose problems.
WARNING!
These are undocumented and non-standard commands. Sybase
Technical Support does not support them for general use. Although
the command syntax is provided here for reference, you should use
these commands only to diagnose specific errors as described in
this document, and with the specific syntax shown in those
sections. Failure to do so could cause performance problems
and/or database corruption.
Command: dbcc page
Purpose: Prints the contents of a page within a database.
Usage:
dbcc page (dbid, pageno,[printopt [,cache [,logical [,cachename]]]])
or
dbcc page (dbname, pageno,[printopt [,cache [,logical [,cachename]]]])
'dbcc page' arguments
| dbid |
database ID |
| dbname |
database name
|
| pageno |
page number
|
| printopt |
| output format: |
0 |
print buffer and page header only (default) |
| 1 |
print buffer and page headers, rows and offset table |
| 2 |
print buffer and page headers, hex dump of data and offset table |
|
| cache |
| where to get the page: |
0 |
read page from disk |
| 1 |
read page from cache if present, otherwise read from disk (default) |
|
| logical |
| The page type |
0 |
pageno is a virtual page |
| 1 |
pageno is a logical page (default) |
|
| cachename |
|
Command: dbcc pglinkage
Purpose: checks the linkage of a page chain.
Usage:
dbcc pglinkage (dbid, start_pg_num, number_pages, printopt, search_for, search_order)
|
dbid
|
database ID
|
|
start_pg_num
|
page number at which to start checking
|
|
number_pages
|
|
the number of pages to check:
|
0
|
check all pages
|
|
|
printopt
|
|
denotes which pages to display:
|
0
|
display only the number of
pages checked
|
|
1
|
display the last 16 pages checked
|
|
2
|
display all the page numbers checked
|
|
|
search_for
|
stop checks when this page number is reached
|
|
search_order
|
|
Search direction:
|
0
|
follow previous page pointers
|
|
1
|
follow next page pointers
|
|
Command: dbcc log
Purpose: displays transaction log records.
Usage:
dbcc log (dbid, objid, pageno, rowno, nrecs, type, printopt)
|
dbid
|
Database ID
|
|
objid
|
can be <0, zero, or> 0. Meaning of this
option depends on the values of 'pageno'
and/or 'rowno'. For example, if objid >0 and
'pageno' and 'rowno' = 0, all records for
that object are displayed.
|
|
pageno
|
page number (or 0)
|
|
rowno
|
row number (or 0)
|
|
nrecs
|
number of records and log scan direction
|
|
type
|
Type of log record to display
|
|
printopt
|
|
denotes display options
|
0
|
display header and data
|
|
1
|
display header only
|
|
Command: dbcc traceflags
(available with 11.0.3 and later)
Purpose: Shows what traceflags, if any, are currently active in the server.
Usage:
dbcc traceflags
Command: dbcc traceon
Purpose: activates the specified trace flag.
Usage:
dbcc traceon (trace_flag)
Command: dbcc traceoff
Purpose: de-activates the specified trace flag.
Usage:
dbcc traceoff (trace_flag)
Command: dbcc memusage
Purpose: Shows memory allocation for server structures and objects, for
example the size and number of stored procedures.
Usage:
dbcc memusage
WARNING! Running dbcc memusage on a multi-engine server can cause the
other running processes to timeslice.
Avoiding Disaster with Good DBA Practices
Server Enterprise.
Note: This information also appears in the 11.9.2 Error Messages and
Troubleshooting Guide. Unless otherwise specified, all references in
this TechNote are to sections of that document.
SYSTEM ADMINISTRATION TIPS
The following practices can keep your Adaptive Server installation working
at peak effectiveness. By maintaining these good practices, you can maximize
server uptime, correct problems proactively, and be as prepared as possible
to handle emergencies.
- Keep Up-to-Date Backups
Maintaining current backups of your data is vital for any recovery
plan. Keep multiple generations of backups, and keep some offsite as an
extra precaution.
Make regular database dumps of:
- the master database. To insure that your backup of master is
always current, back up master after each maintenance command that
affects disks, storage, databases, or segments - for example,
after creating or deleting databases, initializing new devices,
and creating or modifying segments.
- the model database
- the sybsystemprocs database
- user databases.
- Maintain copies of System Tables and DDL
Keep the latest offline copies of the following tables:
- sysusages
- syslogins
- sysloginroles
- sysdatabases
- sysdevices
- syscharsets
- sysconfigures
- sysservers
- sysremotelogins
- sysresourcelimits (11.5 and later)
- systimeranges (11.5 and later)
Use the bcp utility to copy out these tables. In addition, maintain a
hardcopy by printing the output of the following queries:
select * from sysusages order by vstart
select * from sysusages order by dbid, lstart
select * from syslogins
select * from sysloginroles
select * from sysdatabases
select * from sysdevices
select * from syscharsets
select * from sysconfigures
select * from sysservers
select * from sysremotelogins
select * from sysresourcelimits (11.5 and later)
select * from systimeranges (11.5 and later)
Also maintain:
- copies of your configuration file.
- the first two blocks (2 pages) of the master device.
- a copy of the config block. You can generate this using Sybase
Central or Power Designer. On Unix platforms, you can obtain a
copy of the config block with this command:
dd if=master_device of=$SYBASE/config_block.bak bs=1024 count=8
- all Data Definition Language (DDL) scripts you use to create user
objects, specially stored procedures if you elect to use
sp_hidetext (11.5 and later).
Note: Implement all changes to schema in the same way that the
installmaster script is implemented.
- Verify Database Consistency
Run dbcc checks on a regular basis to monitor the health of your
databases. Database-wide checks are available with dbcc checkdb, dbcc
checkalloc, and dbcc checkstorage (11.5 and higher). dbcc checkcatalog
is also a useful tool. For a brief overview of dbcc commands, see
"Useful dbcc Commands". Detailed information appears in the System
Administration Guide.
Since dbcc checks can be resource intensive, consider adopting a
strategy to take advantage of object level dbcc's. On a given day run a
certain number of checktable and tablealloc commands for a portion of
the database. On subsequent days, run different tables. Over a period
of days you can accomplish a complete check of your databases for
integrity. For example if your database has 200 tables in addition to
the system tables, run dbcc's on the system tables on night one, run
dbcc's on each of the first 50 of the user tables on night two, the
next 50 the next night and so on, until at the end of five nights you
have checked every table in the database. On the sixth night you can
begin the cycle again.
Note: Running table-level dbcc's misses the GAM page checks.
Alternative strategies include:
- loading the database to another server, and running the dbcc's on
that server;
- dbcc checkstorage (11.5 and higher).
Building dbcc checks into your regular backup/maintenance schedule can
ensure that you have consistent, accurate backups available at all
times.
- Implement Mirroring
Mirroring, either at the Adaptive Server level or at the operating
system level, can provide nonstop recovery in the event of media
failure.
The factors you need to consider, and instructions on implementing
Adaptive Server mirroring, are detailed in the section titled
"Mirroring Database Devices" in the System Administration Guide.
- Perform Ongoing Maintenance
As part of a routine program of server maintenance, you should:
where date is the current date.
- Monitor the operating system log to keep an eye on the health of
the hardware and the server environment. Many Adaptive Server
errors can be due to underlying hardware problems, and can
therefore indicate hardware problems.
Refer to "Checking the Operating System Error Log" in
"Encyclopedia of Tasks" for information on how to locate your log
and how to check it.
- Monitor space usage with system procedures such as sp_helpsegment,
sp_spaceused, and sp_helpdb. By running sp_spaceused regularly,
for example, you can determine if a database is running out of
space for new objects.
Alternatively, you can set up thresholds to monitor free space on
database segments.
See "Getting Information About Database Storage" and "Creating
Threshold Procedures" in the System Administration Guide for
details.
- On Versions prior to 11.9.2, update index statistics. Distribution
pages hold statistics on the distribution of index key values. As
a table grows and changes, these statistics become old, and the
server may start to choose the wrong index strategy for queries.
You can address this condition by running update statistics
periodically.
Version 11.9.2 and later do not utilize distribution pages;
instead, they use a different mechanism for maintaining
statistics. Refer to New Functionality in Adaptive Server
Enterprise 11.9.2 for details.
- Avoid Risky Practices
- Avoid moving tempdb off the master device. When Adaptive Server is
installed, tempdb resides on the master device. Although it is
possible to move tempdb off the master device later for space
considerations, this is not advisable. Once tempdb is moved off
the master device, it is difficult and time-consuming to recover
if a problem occurs on the master device or the device to which
tempdb is moved.
- Never put anything other than master, model and tempdb on the
master device. Storing user databases on the master device may
make it difficult to recover the system databases or user
databases if either become damaged.
- Recovery Tips, Or What to do When Things Go Wrong
- Choose the correct recovery method. Your choice of methods will be
dictated by the type of failure you encounter. For example, loss
of a device will require restoring from backups.
Network/machine failure usually has little impact on the server
but could corrupt data in some situations, and recovery may fail.
- If mirroring is enabled at your site, disable the mirror before
loading a dump, thus preserving a copy of what you had before in
case dumps are bad.
- Never run buildmaster on the original master device. It may
contain information you need later. Instead, do a buildmaster on a
different device, and when your environment is completely
restored, you can move back to your original master device.
- Additional Tips
After an operating system upgrade, check permissions on your sybase
devices.
Other Useful Information
Browse the most recently released documents.
For frequent updates, visit the Sybase Technical Library on the World Wide
Web at http://techinfo.sybase.com.
Electronic Case Management and Electronic Software Delivery access remains
limited to Sybase support contacts who are registered with SupportPlus
Online Services. ECM and ESD requires SSL web browser support. ECM and ESD
users also may access the Technical Library from behind the firewall in the
usual manner, allowing them to retain the security benefits of using SSL
with their web browser.
Subscription Information
Send comments to technews@sybase.com. To receive this document by regular
email, send email to majordomo@sybase.com, and in the body of your message
put subscribe inews-technews-full. For a summary version by email, put
subscribe inews-technews-summary. No subject line is necessary.
Staff
Principal Editor: Betsy Brazy
Production Editors: Deborah Blakeney, Rodney Esperanza, Leigh Ann Hussey
Contributors: Vinaye Misra
Disclaimer
No express or implied warranty is made by Sybase or its subsidiaries with
regard to any recommendations or information presented in Sybase Technical
News. Sybase and its subsidiaries hereby disclaim any and all such
warranties, including without limitation any implied warranty of
merchantability of fitness for a particular purpose. In no event will Sybase
or its subsidiaries be liable for damages of any kind resulting from use of
any recommendations or information provided herein, including without
limitation loss of profits, loss or inaccuracy of data, or indirect special
incidental or consequential damages. Each user assumes the entire risk of
acting on or utilizing any item herein including the entire cost of all
necessary remedies.
If you ever want to remove yourself from this mailing list, you can send
mail to "majordomo@sybase.com" with the following command in the body of
your email message:
unsubscribe inews-technews-full
If you ever want to add yourself to the TechNews Summary mailing list, you can send email to "majordomo@sybase.com" with the following command in the body of your email message:
subscribe inews-technews-summary