Title: The need for speed
1The need for speed
- SharePoint Optimisation and recovery from a
database perspective
By Tom Bizannes Microsoft Certified
Professional MacroView Business Technology Level
12, 171 Clarence St, Sydney, Australia 612 9249
2700
2Tom Bizannes
- A Database Guy caught in
- the Land of the Lost SharePoint
-
- Working with the leading
- Document Management Software
- for SharePoint (WISDOM)
3Why
- Because Microsoft Released a white paper on
Optimising SharePoint which verified many of our
findings see reference at the end
4What we will cover
- Brief Overview of SharePoint Infrastructure
- SharePoint Recovery - Backing up / Restoring
- Optimizing the need for speed!
- Extras
- Integrating Reporting Services 2008 on IIS7
- Reporting Services and SharePoint Lists
5The Big Picture
6Recovery
- Full Recovery
- SharePoint Central Admin
- Stsadm and psconfig
- Sql Server
- Other Tools
- Granular Recovery
- Stsadm
- Versioning
- Recycle Bin Did you know it has two levels?
- Third Party Tools
7Backing Up
- For restoring from failure or corruption
- Sql Server
- SharePoint Central Admistration
- Windows and stsadmin..via scripts.
- SharePoint Designer
- Third Party Tools
- For item level restore
- Recycle Bin and Versioning
- Stsadm scripts
- Third Party Tools
8Normal Sql Server and Windows
- If MOSS 2007 then backup the config database as
well. - There is a fix if you forgot to backup the
configbut you will pull your hair out first! - Know what service pack and hot fix you are up to
with SharePoint (Schema related) - With the recycle bin and versioning, you may not
need a more granular backup..
9Windows and stsadmin..
- Can get clever with scripts to backup various
sites. - More granular if you have lots of scripts against
different sites. - SharePoint Designer creates the same sort of
backup as stsadmin but needs to be run manually.
10Back Up and Restore by Using Stsadm
- When you back up by using the Stsadm command-line
tool, you can back up individual aspects of your
SharePoint Products and Technologies deployment.
For example, you can back up an individual site
collection or you can back up the entire farm. - To export sites from your SharePoint Products and
Technologies deployment, you use the following
Stsadm command. - Stsadm o export url ltURLgt -filename
ltFileNamegt.cmp - To back up a site collection, you must use the
following stsadm command. - stsadm o backup url ltURL of the site
collectiongt -filename ltName of the backup filegt - To back up an individual database, Web
application, or the entire farm, you can use the
following Stsadm command. - stsadm o backup directory ltUNC path or local
drivegt -backupmethod ltFull or Differentialgt -
- Restore
- To import sites to your SharePoint Products and
Technologies deployment, you use the following
Stsadm command. - Stsadm o import url ltURLgt -filename
ltFileNamegt.cmp - To restore a site collection, you must use the
following Stsadm command. - stsadm o restore url ltURL of the site
collectiongt -filename ltName of the backup filegt - To restore an entire farm you can use the
following Stsadm command. - stsadm o restore directory ltUNC path or local
drivegt -restoremethod ltoverwrite or newgt
11Third party tools
- EMC Backup Manager for SharePoint
- Uses the sql backup to restore a file or site etc
- Microsoft Data Protection Management
- Can do incremental backups every 15 minutes etc
- Doc Avenue
- Great for restoring a site or document library
- Very granular
- Idera Point Backup
12Tuning
- Indexes
- IIS
- Database
- Fragmentation
- Database Files
13Need to work together
- The SharePoint administrator needs to work
together with - The Microsoft Sql Server administrator
- and the Windows Network administrator
14What do you need to know and what do you need to
get the others to do?
15Typical scenarios
- 2 million documents, 2 Terabytes of data, 400
staff in head office, 50 in 4 remote branches - 100,000 documents, 20 staff, 100Mb of documents
and growing, one branch with 100 external clients
logging in - 20 millions documents, 20 terabytes of data, 4
countries and 50 branches around the world
16Our analysis
- Databases get fragmented very quickly
- Searching is the biggest CPU hog
- Virtualising SharePoint is great but beware of
Virtualising Sql Server .
17Tuning - What you need to know?
- IIS Settings
- Setting indexed columns
- The structure of the SharePoint implementation
Good planning and structure makes SharePoint more
responsive - Database tuning very important
18Optimise your Search Indexer
- There are many factors involved in the SharePoint
crawling process that can impact indexing
performance. There are also some steps you can
take to improve that. Here are the common causes
and their resolution - Indexing Performance is set at reduced - common
mistake on the configuration screen for the index
service. See Central Administration gt Operations
gt Services on Server gt Office SharePoint Server
Search Service Settings and set to Maximum. - Number of Connections - by default the indexer
will run a limited number of simultaneous threads
(6 usually). This can be increased manually by
adding specific Crawler Impact Rules for each
host. You can really improve speed by setting a
large file server to 32 connections. But watch
your network for bottlenecks. - Crawled systems are slow or hosted on remote
networks. - not a lot to be done here, except by
moving those file closer. - Overlapping Crawls - SharePoint gives priority to
the first running crawl so that if you already
are indexing one system it will hold up the
indexing of a second and increase crawl times. - Solutions Schedule your crawl times so there is
no overlap. Full crawls will take the longest so
run those exclusively. - IFilter Issues - the Adobe PDF IFilter can only
filter one file at a time and that will slow
crawls down. - Using a PDF filter from pdflib.com or Foxit
- Not enought Memory Allocated to Filter Process -
you can increase the memory allocation by
adjusting the following registry keys - HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office
Server\12.0\Search\Global\Gathering Manager set
DedicatedFilterProcessMemoryQuota 200000000
Decimal - HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office
Server\12.0\Search\Global\Gathering Manager set
FilterProcessMemoryQuota 200000000 Decimal - Bad File Retries - there is a setting in the
registry that controls the number of times a file
is retried on error. This will severly slow down
incremental crawls as the default is 100. This
retry count can be adjust by this key - HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office
Server\12.0\Search\Global\Gathering Manager set
DeleteOnErrorInterval 4 Decimal - General Architecture Issues - Ensure that you
have at least 2 Gig of free memory available
before your crawl even starts and that you have
at least 2 real processors available - Disk Health - the nature of the indexing process
causes extensive fragmentation of the file system
for both the index server and the database
server. Schedule defrags routinely and after all
full crawls. Ensure you have enough diskspace
always. - Run 64 bit OS - school is still out on this one,
i personally haven't seen must difference as long
as there is enough memory and the same processor
types, but MS recommends this for large
deployments.
19More questions for the SharePoint Administrators
- Versioning
- This impacts on the size, but makes restoring
less of an issue - Can delete minor versions and do all sorts of
other things if the database gets too big. - Auditing
- This can grow out of control
- Have you set indexed columns on large document
libraries for performance? - How many site collections do you have and how big
will they grow? - Can set a separate database for each one
- E.g. might have historical data on one etc
- Maybe a site collection for clients A to M and
another for clients N to Z
20Indexed Columns in a SharePoint Library or List
- every list item that has an indexed column will
create a new row in the NameValuePair table.(The
NameValuePair_Latin1_General_CI_AS table for
the English Language version of SharePoint) - The table also needs to be updated whenever
indexed columns change their value or when items
are added or deleted. E.g. Having a list with
100000 items and an index column for instance
means that you also add 100000 items to
the NameValuePair table. And - every time you
update your list - NameValuePair need to be
updated as well.
21The Databases
- Main Database is Wss_content
- Know what service packs and hot fixes have been
applied if you want to restore! - Set a 100G limit per site collection? (This is
as per the Microsoft Whitepaper ) - Someone suggested 300G is the new 100G limit
- The limit all depends on your infrastructure
22What version are you running?
Go into Control Panel and show the version field
23Run this against your Sql Database
- SELECT VersionId ,Version
,Id ,UserName ,TimeStamp
,FinalizeTimeStamp ,Mode
,ModeStack ,Updates ,Notes
FROM SharePoint_Config.dbo.Versions
WHERE VersionId '00000000-0000-0000-0000-0000000
00000' ORDER BY Id DESC
24Relevant Versions for V3 / 2007
- Using SharePoint Central Administration Web site
SharePoint HTML Site Settings admin pages or IIS
Manager, on the web sites properties HTTP Headers
tab, virtual servers once extended will show the
following version numbers - Service Pack 2 - 12.0.0.6421
- MOSS 20071 or WSS 3.0 Cumulative update
(KB956056 KB956057) 12.0.0.6327MOSS 20071 or
WSS 3.0 Infrastructure Update (KB951695
KB951297) 12.0.0.6318MOSS 20071 or WSS
3.0 SP1
12.0.0.6219 MOSS 20071 or WSS 3.0 RTM
12.0.0.4518
25Always backup your Content Store
- The main database is wss_content
- You can create a content store per site
collection. E.g. wss_content_hr,
wss_content_admin etc - You can get your SharePoint Administrator to set
up an alert via SharePoint Administration when it
gets close to the capacity you set.
26Other networking things
- Enable IIS Compression
- As a general best practice, the SharePoint
Operations team verified that IIS compression for
static content was enabled on front-end servers.
Enabling static compression is especially helpful
for serving content to users over slower links - Get fast Network cards on all servers (1Gb)
27Service packs and hot fixes
- Searching had a few bugs that were fixed with
Service Pack 1 - The crawler never finished if indexes were
rebuilt - Service Pack 2 had even more fixes for searching
- Always note what service pack and hot fix you are
up to as you will need these if you ever restore
or the schemas will cause issues etc
28Optimising the databases
- Set a fill factor on all indexes to 70
- Q Where do you set a default fill factor?
- Use indexed columns on large document libraries
for performance (For your SharePoint Guy) - Create different site collections and/or document
libraries on different content databases - Separate Databases see following
29Separate Tempdb first
- Office SharePoint Server 2007 farm performance
can be significantly impeded by insufficient disk
I/O for the tempdb. To avoid this issue, allocate
dedicated disks for the tempdb. If a high
workload is projected or monitored that is, the
average read operation or the average write
operation requires more than 20 milliseconds
(ms) you might need to ease the bottleneck by
either separating the files across disks, or by
replacing your disks with faster disks. - For best performance, place the tempdb on a RAID
10 array. The number of tempdb data files should
equal the number of core CPUs, and the tempdb
data files should be set at an equal size. Count
dual core processors as two CPUs for this
purpose. Count each processor that supports
hyper-threading as a single CPU.
30Separate and prioritize your data among disks
- When prioritizing data among faster disks, use
the following ranking - Tempdb data and transaction logs
- Database transaction log files
- Search database
- Database data files
- In a heavily read-oriented portal site,
prioritize data over logs. - Ideally, place the tempdb, content databases, and
SQL Server 2005 transaction logs on separate
physical hard disks.
31Separate and prioritize your data among disks -
continued
- Only create files in the primary filegroup for
the database. - Distribute the files across separate disks.
- Create data files of equal size.
- Separate database data and transaction log file
across different disks. If files must share disks
because the files are too small to warrant a
whole disk or stripe or you have a shortage of
disk space, put files that have different usage
patterns on the same disk to minimize
simultaneous access requests. - Consult your storage hardware vendor for
information about how to configure all logs and
the search databases for write optimization for
your particular storage solution. - Allocate dedicated spindles for the search
database.
32Microsofts Conclusions p1
- Best practices for the front end include
- Run IIS version 7.0 on 64-bit servers Memory
and CPU are common performance optimization
factors for SharePoint Server. Using 64-bit
hardware increases the amount of usable memory,
which helps to maintain a healthy system state
for worker processes. - Use a front-end and back-end NIC configuration
for IIS During peak load times, as many people
access SharePoint sites, the NIC traffic
increases. Using dedicated NICs for connections
to the SQL Server back end and the clients
provides better load distribution. Using
dedicated NICs also provides more-accurate
statistics and helps with troubleshooting traffic
congestion issues by segregating the front-end
and back-end traffic. - Load balance client traffic The SharePoint
Operations team uses NLB for balancing client
traffic. It is a best practice to load balance
incoming traffic for optimal user experience and
server utilization. - Use IIS compression for static content The
SharePoint Operations team ensures that static
compression is enabled to conserve traffic and
server resources. - Enable caching Page output caching on front-end
servers reduces CPU utilization on front-end
servers by storing compiled ASP.NET pages in RAM.
Enabling this setting resulted in performance
gains for the SharePoint Operations team. BLOB
caching helps to relieve load on back-end servers
by caching static content and not accessing
databases when it is requested.
33Microsofts Conclusions p2
- Best practices for the back end include
- Limit database size to enhance manageability Whe
n databases grow, they can become less manageable
for backup and restore operations, or for
troubleshooting. The SharePoint Operations team
uses a 100-GB limit. - Allocate storage for versioning and the recycle
bin When designing the environment, an
organization should consider business needs, such
as versioning, and ensure that adequate disk
space and I/O are available to accommodate them. - Use quota templates to manage storage Microsoft
IT uses standardized configuration templates in
all possible and practical scenarios, including
quotas. Using quota templates helps preserve a
standard environment, which reduces
administrative overhead. - Manage large lists for performance Having large
lists by itself is not necessarily a performance
issue. When SharePoint Server renders the many
items in those lists, that can cause spikes in
render times and database blocking. One way to
mitigate large lists is to use subfolders and
create a hierarchical structure where each folder
or subfolder has no more than 3,000 items. - Separate and prioritize data among disks and
create disk groups for specific data Because
available disk I/O throughput is so important for
optimal SQL Server performance, identifying the
read/write patterns of services and dedicating
SAN LUNs to them results in better performance
than using many service types with the same disk
group. The SharePoint Operations team takes this
idea a step farther and uses dedicated partitions
for data.
34The tools Microsoft talks about
- Event Viewer This tool is especially useful for
understanding the underlying behavior by
evaluating application errors and warnings, or
investigating system events that occur before,
during, and after a performance incident. - Dump file analysis Analyzing dump files is an
advanced troubleshooting and analysis approach
that provides low-level information about
critical system errors and memory dumps. It
enables the SharePoint Operations team to examine
the data in memory and analyze the possible
causes of such issues as memory leaks and invalid
pointers. - System Monitor The SharePoint Operations team
uses tools such as Event Viewer and dump file
analysis to investigate specific incidents and
performance issues. The team uses System Monitor
in the Windows Server 2003 operating system
(called Performance Monitor in Windows
Server 2008) for establishing a performance
baseline, tracking trends, and compiling data on
resulting performance after making changes. - SQL Server Profiler This tool is a graphical
user interface to SQL Trace for monitoring an
instance of SQL Server Database Engine or SQL
Server Analysis Services. Microsoft IT and other
teams use this tool to evaluate SQL Server
performance aspects such as query times, stored
procedure run times, and deadlocks. This tool is
especially useful for analyzing the underlying
calls to SQL Server databases that are housed on
the storage area network (SAN). - Custom tool for client-based URL ping The
SharePoint Operations team created a custom tool
that recorded the time to first byte for URLs
hosted on SharePoint servers. This is one of the
most useful tools because it enables the
comparison of statistics before and after
implementing configuration changes to the
environment. - Log Parser The SharePoint Operations team uses
logging extensively when determining root causes
of issues, including SharePoint trace logs and
IIS and Unified Logging Service (ULS) application
and service logs. Microsoft IT uses Log Parser as
one of the tools to monitor traffic, determine
traffic sources distribution, and establish
performance baselines. This free tool parses IIS
logs, event logs, and many other kinds of
structured data by using syntax similar to
Structured Query Language (SQL). For more
information about Log Parser, refer to the Script
Center resource at http//www.microsoft.com/techne
t/scriptcenter/tools/logparser/default.mspx. - Fiddler This tool is helpful for measuring
caching, page sizes, authentication, and general
performance issues. For more information, visit
the Fiddler Web site at http//www.fiddler2.com/fi
ddler2/.
35Tools for monitoring
- Performance Dashboard for Microsoft SQL Server
2005( Nice set of Reports) - (note update for sql 2008)
- Your own scripts (watch my blog for new ones)
- SQL Server 2008 - Performance Studio
- Beware of how large the database can grow
- Great for benchmarking
- Sql 2008 activity Monitor
36Summary
- 90 of it is the Database!
- Map out what is happening
- Check the fill factor
- Split databases for performance (Tempdb, Logs,
Wss_Content, etc) - Have good benchmarking / reporting in place
- Check the growth and cpu usage
- Work together with all parties
- Plan and keep on planning
- Check out some tools to make life easier
37References
- SharePoint Performance Optimization How
Microsoft IT Increases Availability and Decreases
Rendering Time of SharePoint Sites, which
discusses how Microsoft IT discovered
opportunities to enhance SharePoint
optimization. - Manage lists and libraries with many items
- Performance Recommendations and Best Practices
- Data protection and recovery for Microsoft
Office SharePoint Server 2007 - The Company I work for
- Document Management for SharePoint