Title: Active Directory Scalability
1 Active Directory Scalability
Pierre Bijaoui, Micky Balladelli
Solution Architects, Applied Microsoft Technologies Group
Compaq Services
3 Agenda
- Review of key business goals
- Principles
- Solutions
- Examples
- Best Practices
4 Who Benefits?
- Enterprises
- Fewer, but larger DCs
- Consolidation
- Enterprise Identity
- PKI
- ISPs
- Telecom Operators
- Applications (Exchange 2000)
5 What Do We Want To Scale?
- Entries
  - Millions
- Servers
  - Thousands
- Management
  - Millions of users, but few or no admins (auto-provisioning)
- Clients
  - Sustaining many concurrent queries
6 Active Directory Fundamentals
- Domains are security boundaries
- Objects are stored in the Active Directory
- Schema is extensible
- Can contain millions of objects
- Hierarchically organized with OUs
- Multi-master replication between DCs
- Designs tend to go with fewer but larger DCs than NT4
- Reduce system and management costs
- As well as network replication costs
7 Active Directory Fundamentals
- Domains are linked by Kerberos trusts to form Trees and Forests
- AD is the sum of the domains in the forest
- Reasons for splitting a domain are mostly business, legal and political, rarely technical
- GCs contain all objects of AD but only a subset of their attributes
- Customizable in the Schema
8 AD Database: The heart of the system
- ESENT
- Jet Blue, derived from Exchange
- 8KB pages
- Extensible, but scalable
- 40M entries in a single database
- Set of LOGS and EDB (property database)
- Same scalability as Exchange databases
- Single file database!
9 Relation To LSASS
- Interfaces: ADSI, LDAP, MAPI, other
- Directory System Agent (DSA)
- DB layer
- Extensible Storage Engine (ESE)
- Files: NTDS.DIT, log files, EDB.CHK, TEMP.EDB
10 AD Write Operation
[Diagram labels: LSASS.EXE, Transaction, Write]
11 I/O Patterns For The ESE
- NTDS.DIT
  - 8 KB I/O size
  - 70-90% read
  - Asynchronous writes
  - Multi-threaded
  - Random access
- Log area
  - 8 KB I/O size
  - Synchronous writes
  - Single-threaded
  - Sequential access
12 Tests We Did
- Build 16M then 40M entries
- Same response time regardless of size
- 110M entries in the plans
- Replicated to a second DC
- Hardware used
- Strong storage platform
- Security
- Performance (I/O/s, not MB/s)
- Capacity (several dozen GB)
- Replicate to a second DC
13 How We Did It
- Active Directory is fully scriptable
- Scripts are easy to write
- But they are slow, and single-threaded
- ADSI can be used in a multi-threaded environment
- Better suited to LSASS
- Increases speed by an order of magnitude
14 How We Did It
- Our program
- Took the multi-threaded approach with Visual C++
- LDAP connection to the DC via ADSI
- Input the number of top-level OUs
  - Defines the number of threads
- Input the number of child OUs
- Input the prefix for names
- Randomly generate user and OU names
- Let it run for a couple of weeks
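The loader design above (one thread per top-level OU, a configurable number of child OUs, a name prefix, and random names) can be sketched as follows. This is only an illustration in Python, not the authors' Visual C++ program: the function names, the `DC=example,DC=com` base DN, and the `users_per_ou` parameter are assumptions, and the sketch collects distinguished names in memory instead of writing them to a DC.

```python
import random
import string
import threading


def random_name(prefix, rng, length=8):
    """Return the prefix followed by a random lowercase suffix."""
    suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{prefix}{suffix}"


def populate_top_level_ou(index, child_ous, users_per_ou, prefix, base_dn, out, lock):
    """Worker: build the DNs for one top-level OU, its child OUs and their users."""
    rng = random.Random(index)  # per-thread generator, seeded for repeatability
    # A counter is appended to each random name so that DNs stay unique.
    top_dn = f"OU={random_name(prefix, rng)}{index},{base_dn}"
    entries = [top_dn]
    for c in range(child_ous):
        child_dn = f"OU={random_name(prefix, rng)}{c},{top_dn}"
        entries.append(child_dn)
        for u in range(users_per_ou):
            entries.append(f"CN={random_name(prefix, rng)}{u},{child_dn}")
    # A real loader would send each entry to the DC here instead of collecting it.
    with lock:
        out.extend(entries)


def generate(top_level_ous, child_ous, users_per_ou, prefix,
             base_dn="DC=example,DC=com"):
    """Start one thread per top-level OU and return every generated DN."""
    out, lock = [], threading.Lock()
    threads = [
        threading.Thread(
            target=populate_top_level_ou,
            args=(i, child_ous, users_per_ou, prefix, base_dn, out, lock),
        )
        for i in range(top_level_ous)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Calling `generate(4, 3, 5, "t")` starts four threads, mirroring the rule that the number of top-level OUs defines the number of threads.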
15 What Was Generated
- The 16 million test contained just users
- The idea is that each user requires a SID, and therefore a read in the database
- Exercises both reads and writes by several hundred threads
- The 40 million test contained both users and contacts
- Contacts are much cheaper than users
- About 1 KB versus 4 KB
16 Keeping An Eye On The Storage
- 16 million users created
- Linear growth pattern
- 10,000 users: 62.2 MB
- 100,000 users: 454 MB
- 1,000,000 users: 4.1 GB
- 10,000,000 users: 41.9 GB
- 16,000,000 users: 68.6 GB
- Performance: 23 users/sec (now over 40 users/sec)
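As a rough check on the linear growth claim, the figures above can be turned into a per-user cost and a load time. The sketch below assumes decimal units (1 MB = 10^6 bytes), which the slide does not specify, and the helper names are illustrative.

```python
# (users, database size in bytes) as reported on the slide,
# assuming decimal units: 1 MB = 10**6 bytes, 1 GB = 10**9 bytes.
MEASUREMENTS = [
    (10_000, 62.2e6),
    (100_000, 454e6),
    (1_000_000, 4.1e9),
    (10_000_000, 41.9e9),
    (16_000_000, 68.6e9),
]


def bytes_per_user(users, size_bytes):
    """Average database bytes consumed per user entry."""
    return size_bytes / users


def load_days(users, users_per_second):
    """Days needed to create `users` entries at a steady creation rate."""
    return users / users_per_second / 86_400


if __name__ == "__main__":
    for users, size in MEASUREMENTS:
        kb = bytes_per_user(users, size) / 1000
        print(f"{users:>10,} users: {kb:.2f} KB per user")
    for rate in (23, 40):
        days = load_days(16_000_000, rate)
        print(f"16,000,000 users at {rate}/s: {days:.1f} days")
```

Beyond the smallest data point, where fixed overhead dominates, the cost settles at a little over 4 KB per user. At 23 users/sec the 16 million entries take roughly eight days to create, and under five days at 40 users/sec.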
17 Performance Fundamentals: Request Rate versus Response Time
[Chart: response time (ms, axis 0 to 250) versus request rate (I/O/s, axis 0 to 160)]
18 Drive Choices Affect Performance
19 Fine Tuning The DIT Store
- Initial tests
- 20% CPU time
- Several milliseconds per I/O
- Fine-tuning the writeback cache
- Reduced the I/O bottleneck
- Ensure fast access to the LOG area
- Increased CPU usage to 70%
- Add lots of spindles for the DIT volume
20 Fine Tuning Replication
- Fast LAN speed between DCs (100 Mb/s)
- By default a DC is optimized to reduce network utilization
- The following can be optimized
  - Packet sizes
  - Number of packets
  - Priority to increase replication (multi-processing)
  - Latency for notification of
    - New modification
    - Next replication partner
21 Systems Used
- AlphaServer 4100, 4 CPUs, 2 GB or 4 GB RAM (this is a pretty old system)
- ProLiant 3000, PII 450 MHz, 512 MB
- StorageWorks ESA 10000 per replica
  - 300 GB for ntds.dit
  - 100 GB for log files
22 OS Used
- Windows 2000 Beta 3 pre-RC0 IDS build
- For 16 million objects
- Windows 2000 Beta 3 RC1
- For 40 million objects
- Windows 2000 Beta 3 RTM
- For large replicas
- Advanced Server
- Supports a cache of 1GB for LSASS
- Versus 512 MB for Server
- Looking forward to Data Center
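To put the cache limits in perspective, the short calculation below relates them to the 8 KB page size and the 68.6 GB database reported earlier. It assumes binary units for the cache (1 GB = 2^30 bytes) and decimal units for the database size. Neither is stated in the slides, and the helper names are illustrative.

```python
PAGE_SIZE = 8 * 1024  # 8 KB database pages


def pages_in_cache(cache_bytes):
    """Number of whole database pages that fit in a cache of the given size."""
    return cache_bytes // PAGE_SIZE


def cached_fraction(cache_bytes, database_bytes):
    """Share of the database that can be held in the cache at once."""
    return cache_bytes / database_bytes


if __name__ == "__main__":
    advanced_server = 1 * 1024**3   # 1 GB cache
    server = 512 * 1024**2          # 512 MB cache
    database = 68.6e9               # 16 million user database
    for name, cache in (("Advanced Server", advanced_server), ("Server", server)):
        share = cached_fraction(cache, database)
        print(f"{name}: {pages_in_cache(cache):,} pages, {share:.1%} of the database")
```

Even the larger cache holds only a small percentage of a database this size, which is consistent with the emphasis on I/O rate rather than throughput for the storage platform.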
23 Management Scalability
- Auto-Provisioning
- Delegation
24 Management Scalability: Auto-Provisioning
- Goal: reduce the administration cost for many, many entries
- How: by letting the users do most of the work
- Account creation
- Account maintenance
- Complement with scripts (remove expired accounts, etc.)
25 AD Auto-Provisioning: Challenges
- Information protection
  - Can I modify all fields?
  - Can I see all fields?
  - Can I modify other entries?
- User interface
  - Web is most practical
  - ADSI from ASP
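The three information-protection questions above amount to a per-attribute permission check in the self-service front end. The sketch below is a minimal illustration with an in-memory dictionary standing in for the directory. The attribute sets, the function names and the policy itself (owners may edit only a short list of their own attributes) are assumptions, not something the slides specify.

```python
# Hypothetical policy: what an account owner may see and change on
# their own entry. Nobody may touch another user's entry.
SELF_READABLE = frozenset({"cn", "mail", "telephoneNumber", "department"})
SELF_WRITABLE = frozenset({"mail", "telephoneNumber"})


def can_read(requester_dn, target_dn, attribute):
    """Can I see this field? Only on my own entry, and only readable fields."""
    return requester_dn == target_dn and attribute in SELF_READABLE


def can_modify(requester_dn, target_dn, attribute):
    """Can I modify this field? Only on my own entry, and only writable fields."""
    return requester_dn == target_dn and attribute in SELF_WRITABLE


def apply_update(directory, requester_dn, target_dn, changes):
    """Apply `changes` to the target entry, or raise if any field is denied."""
    denied = sorted(a for a in changes
                    if not can_modify(requester_dn, target_dn, a))
    if denied:
        raise PermissionError(f"not allowed to modify: {', '.join(denied)}")
    directory[target_dn].update(changes)
```

Rejecting the whole request when any attribute is denied keeps an entry from being left half-updated. A production system would enforce the same rules with attribute-level ACLs in the directory rather than only in the web layer.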
26 Management Scalability: Delegation
- Delegate entire branches
- Local data management (HR, Security, Telecom, IM)
- Protected entry area (attribute-level ACL)
- Keep an overall consistent and useful picture
- For management
- For the users!
27 GCs And DCs: A Possible Solution
- Replication is at the attribute level
- Stop private attributes from becoming public
- Use ACLs on attribute entries
- Schema modification
- GCs isolate DCs
- DC is used for Enterprise operations
- GCs support external user queries
- Many can be used to scale
- If one breaks, add another one
- Does not contain sensitive data
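The idea that a GC carries only a non-sensitive subset of each entry's attributes can be pictured as a projection of the full entry onto a public attribute set. The sketch below uses plain dictionaries. The attribute names and the `gc_view` helper are illustrative assumptions, and the real selection is made in the schema, not in application code.

```python
# Hypothetical set of attributes marked in the schema for replication to GCs.
PUBLIC_ATTRIBUTES = frozenset({"cn", "mail", "telephoneNumber"})


def gc_view(entry, public_attributes=PUBLIC_ATTRIBUTES):
    """Return the subset of an entry that a GC would hold."""
    return {name: value for name, value in entry.items()
            if name in public_attributes}


if __name__ == "__main__":
    full_entry = {
        "cn": "Alice Example",
        "mail": "alice@example.com",
        "telephoneNumber": "+1 555 0100",
        "homeAddress": "private",   # stays on the DC only
        "salaryGrade": "private",   # stays on the DC only
    }
    print(gc_view(full_entry))
```

Because private attributes never reach the GC, a compromised or failed GC can be replaced without exposing the master directory.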
28 Large Number Of Queries: Spread the load
- Use multiple DCs to return the information
- Relevance of NLB, DNS round-robin, etc
- Isolate query traffic from entry management
- Assemble multiple DCs around a SAN
- Share a common, secure and high-performing
storage infrastructure
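Spreading queries over several servers, as NLB or DNS round-robin would, can be illustrated with a client-side rotation. The class below is a minimal sketch. The server names are placeholders, and real deployments would normally rely on the network-level mechanisms named above rather than on application code.

```python
import itertools
import threading


class RoundRobinPool:
    """Hand out servers in strict rotation, safely across threads."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(self._servers)
        self._lock = threading.Lock()

    def next_server(self):
        """Return the next server in the rotation."""
        with self._lock:
            return next(self._cycle)


if __name__ == "__main__":
    # Placeholder host names for a set of query servers.
    pool = RoundRobinPool(["gc1.example.com", "gc2.example.com", "gc3.example.com"])
    for _ in range(6):
        print(pool.next_server())
```

Keeping the query pool separate from the servers used for entry management matches the goal of isolating query traffic.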
29 Sample Designs: GC, DC and Firewall
[Diagram labels: GC (six instances), Public network, FW, DMZ, DC1, DC2, DCn, Private network]
30 Sample Designs: DC Farm
[Diagram labels: Systems (DC1, DC2, DCn), FC-based SAN, Storage 1, Storage 2, Storage n, Protected and Scalable Storage]
31 Best Practices
- Several GCs
- Redundancy
- Adverse effect on replication
- GC versus DC
- Attribute-level replication can shield the master directory
- GC gets public
- DC remains highly private and protected
32 Best Practices
- Carefully review and modify your schema
- No way back
- Use good quality hardware