
Use Exchange Web Services and PowerShell to Discover and Remove Direct Booking Settings


Prior to Exchange 2007, there were two primary methods of implementing automated resource scheduling – Direct Booking and the AutoAccept Agent (a store event sink released as a web download for Exchange 2003). In Exchange 2007, we changed how automated resource scheduling is implemented. The AutoAccept Agent is no longer supported, and the Direct Booking method, technically an Outlook function, has been replaced with a server-side calendar booking function called the Resource Booking Attendant.

Note There are various terms associated with this new Resource Booking function, such as: Calendar Processing, Automatic Resource Booking, Calendar Attendant Processing, Automated Processing and Resource Booking Assistant. We will be using the “Resource Booking Attendant” nomenclature for this article.

While the Direct Booking method for resource scheduling can indeed work on Exchange Server 2007/2010/2013, we strongly recommend that you disable Direct Booking for resource mailboxes and use the Resource Booking Attendant instead. Specifically, we are referring to the “AutoAccept” Automated Processing feature of the Resource Booking Attendant, which can be enabled for a mailbox after it has been migrated to Exchange 2007 or later and upgraded to a Resource Mailbox.

Note The published resource mailbox upgrade guidance on TechNet instructs you to disable Direct Booking in the resource mailbox while it is still on Exchange 2003, move the mailbox, and then enable the AutoAccept functionality via the Resource Booking Attendant. This order of steps can leave the resource mailbox without automated scheduling capabilities for an unnecessarily long time.

We are currently working to update that guidance to reflect moving the mailbox first and only then disabling the Direct Booking functionality, after which the AutoAccept functionality via the Resource Booking Attendant can be enabled immediately. This shortens the window during which the mailbox is without automated resource scheduling capabilities.
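
For reference, once a mailbox has been moved, the conversion and AutoAccept enablement can be done from the Exchange Management Shell. The following is a minimal sketch for Exchange 2010/2013 (Exchange 2007 uses Set-MailboxCalendarSettings instead of Set-CalendarProcessing); the mailbox name is a placeholder:

# Convert the migrated mailbox to a room mailbox, then enable the
# Resource Booking Attendant's AutoAccept processing.
Set-Mailbox -Identity "ConfRoom-4East" -Type Room
Set-CalendarProcessing -Identity "ConfRoom-4East" -AutomateProcessing AutoAccept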

This conversion to resource mailboxes that use the Resource Booking Attendant is sometimes overlooked, or even deliberately skipped, when migrating away from Exchange 2003 because Direct Booking continues to work with newer versions of Exchange, even Exchange Online. The result is often resource mailboxes (or even user mailboxes!) with Direct Booking functionality remaining in place long after Exchange 2003 is ancient history in the environment.

Why not just leave Direct Booking enabled?

There are issues that can arise from leaving Direct Booking enabled, from simple administrative burden scenarios all the way to major calendaring issues. Additionally, Resource Booking Attendant offers advantages over Direct Booking functionality:

  1. Direct Booking, technically an Outlook function, has been deprecated from the product as of Outlook 2013. It was already on the deprecation list in Outlook 2010 and required a registry modification to reintroduce the functionality.
  2. Direct Booking and Resource Booking Attendant are conflicting technologies, and if simultaneously enabled, unexpected behavior in calendar processing and item consistency can occur.
  3. Outlook Web App (as well as any non-MAPI clients, like Exchange ActiveSync (EAS) devices) cannot use Direct Booking for automated resource scheduling. This is especially relevant for Outlook Web App-only environments where the users do not have Microsoft Outlook as a mail client.
  4. The Resource Booking Attendant AutoAccept functionality is a server-side solution, eliminating the need for client-side logic in order to automatically process meeting requests.

How do I check which mailboxes have Direct Booking Enabled?

How does one validate if Direct Booking settings are enabled on mailboxes in the organization, especially if mailboxes had previously been hosted on Exchange 2003?

Screenshot: Resource Scheduling properties
Figure 1: Checking Direct Booking settings in Microsoft Outlook 2010

Unfortunately, the manual steps involve assigning permissions to all mailboxes, creating a MAPI profile for each mailbox, logging into each mailbox, checking Tools > Options > Calendar > Resource Scheduling, noting which of the three Direct Booking checkboxes are checked, clicking OK/Cancel a few times, and logging out of the mailbox. Whew! That can be a major undertaking even for a small to midsize company with more than a handful of mailboxes! Having staff perform this type of activity manually can be a costly and tedious endeavor. Once you have discovered which mailboxes have the Direct Booking settings enabled, you would then have to repeat this entire process to disable these settings unless you removed them at the time of discovery.

Having an automated method to discover, track, and even disable Direct Booking settings would be nice, right?

Look no further, we have the solution for you!

Using Exchange Web Services (EWS) and PowerShell, we can automate the discovery of Direct Booking settings that are enabled, track the results, and even disable them! We wrote Remove-DirectBooking.ps1, a sample script, to do exactly that and even more to aid in automating this manual effort.

After you've downloaded it, rename the file and remove the .txt extension.

Let’s break down the major tasks the PowerShell script does:

  1. Uses EWS Application Impersonation to tap into a mailbox (or set of mailboxes) and read the three MAPI properties where the Direct Booking settings are stored. It does this by accessing the localfreebusy item sitting in the NON_IPM_SUBTREE\FreeBusy Data folder, which resides in the root of the Information Store in the mailbox. The three MAPI properties and their equivalent Outlook settings the script looks at are:

    • 0x686d Automatically accept meeting requests and remove canceled meetings
    • 0x686f Automatically decline meeting requests that conflict with an existing appointment or meeting
    • 0x686e Automatically decline recurring meeting requests

    These three properties contain Boolean values mirroring the Resource Scheduling checkboxes found in Outlook (see Figure 1 above). A minimal sketch of reading these properties via EWS appears after this list.

  2. For mailboxes where Direct Booking settings were detected, it checks for conflicts by determining if the mailbox also has Resource Booking Attendant enabled with AutomateProcessing set to AutoAccept.
  3. Optionally, disables any enabled Direct Booking settings encountered.

    Note It is important to understand that by default the script runs in a read-only mode. Additional command line switches are available to run the script to disable Direct Booking settings.

  4. Writes a detailed runtime processing log to console and log file.
  5. Creates a simple output text file containing a list of mailboxes that can be later leveraged as an input file to feed the script for disabling the Direct Booking functionality.
  6. Creates a CSV file containing statistics of the list of mailboxes processed with detailed information, such as what was discovered, any errors encountered, and optionally what was disabled. This is useful for performing analysis in the discovery phase and can also be used as another source to create an input file to feed into the script for disabling the Direct Booking functionality.
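
To give a concrete sense of item 1 above, here is a minimal, illustrative sketch (this is not the internal logic of Remove-DirectBooking.ps1) that reads the three Direct Booking MAPI properties from the localfreebusy item of a single mailbox using the EWS Managed API with impersonation. The SMTP address is a placeholder, and the subject filter used to locate the localfreebusy item is an assumption:

Add-Type -Path "C:\Program Files\Microsoft\Exchange\Web Services\2.0\Microsoft.Exchange.WebServices.dll"

$mbx = "room1@contoso.com"
$svc = New-Object Microsoft.Exchange.WebServices.Data.ExchangeService([Microsoft.Exchange.WebServices.Data.ExchangeVersion]::Exchange2010_SP2)
$svc.UseDefaultCredentials = $true
$svc.AutodiscoverUrl($mbx, {$true})
$svc.ImpersonatedUserId = New-Object Microsoft.Exchange.WebServices.Data.ImpersonatedUserId([Microsoft.Exchange.WebServices.Data.ConnectingIdType]::SmtpAddress, $mbx)

# The three Direct Booking MAPI properties (Boolean values)
$directBookingProps = 0x686D, 0x686F, 0x686E | ForEach-Object {
    New-Object Microsoft.Exchange.WebServices.Data.ExtendedPropertyDefinition($_, [Microsoft.Exchange.WebServices.Data.MapiPropertyType]::Boolean)
}

# Locate the "Freebusy Data" folder under the mailbox root (non-IPM subtree)
$rootId   = New-Object Microsoft.Exchange.WebServices.Data.FolderId([Microsoft.Exchange.WebServices.Data.WellKnownFolderName]::Root, $mbx)
$fbFilter = New-Object Microsoft.Exchange.WebServices.Data.SearchFilter+IsEqualTo([Microsoft.Exchange.WebServices.Data.FolderSchema]::DisplayName, "Freebusy Data")
$fbFolder = $svc.FindFolders($rootId, $fbFilter, (New-Object Microsoft.Exchange.WebServices.Data.FolderView(1))).Folders[0]

# Find the localfreebusy item and request the three properties (subject filter is an assumption)
$view = New-Object Microsoft.Exchange.WebServices.Data.ItemView(1)
$view.PropertySet = New-Object Microsoft.Exchange.WebServices.Data.PropertySet([Microsoft.Exchange.WebServices.Data.BasePropertySet]::IdOnly)
$directBookingProps | ForEach-Object { $view.PropertySet.Add($_) }
$itemFilter = New-Object Microsoft.Exchange.WebServices.Data.SearchFilter+IsEqualTo([Microsoft.Exchange.WebServices.Data.ItemSchema]::Subject, "LocalFreebusy")
$localFB = @($svc.FindItems($fbFolder.Id, $itemFilter, $view))[0]

# A value of True means the corresponding Direct Booking checkbox is enabled
$localFB.ExtendedProperties | ForEach-Object { "0x{0:X} = {1}" -f $_.PropertyDefinition.Tag, $_.Value }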

Example Scenarios

Here are a couple of example scenarios that illustrate how to use the script to discover and remove enabled Direct Booking settings.

Scenario 1

You've recently migrated from Exchange 2003 to Exchange 2010 and would like to disable Direct Booking for your company's conference room mailboxes as well as any user mailboxes that may have Direct Booking settings enabled. The administrator's logged-in account has Application Impersonation rights and the View-Only Recipients RBAC role assigned.

  1. On a machine that has the Exchange management tools & the Exchange Web Services API 1.2 or greater installed, open the Exchange Management Shell, navigate to the folder containing the script, and run the script using the following syntax:

    .\Remove-DirectBooking.ps1 -Identity * -UseDefaultCredentials

  2. The script will process all mailboxes in the organization with detailed logging sent to the shell on the console. Note: depending on the number of mailboxes in the org, this may take some time to complete.
  3. When the script completes, open the Remove-DirectBooking_<timestamp>.txt file in Notepad, which will contain a list of mailboxes that have Direct Booking enabled:

    Screenshot: The Remove-DirectBooking log generated by the script
    Figure 2: Output file containing list of mailboxes with Direct Booking enabled

  4. After reviewing the list, rerun the script with the InputFile parameter and the RemoveDirectBooking switch:

    .\Remove-DirectBooking.ps1 -InputFile '.\Remove-DirectBooking_<timestamp>.txt' -UseDefaultCredentials -RemoveDirectBooking

  5. The script will process all the mailboxes listed in the input file with detailed logging sent to the shell on the console. Because you specified the RemoveDirectBooking switch, it does not run in read-only mode and disables all currently enabled Direct Booking settings encountered.
  6. When the script completes, you can check the status of the removal operation by checking the Remove-DirectBooking_<timestamp>.csv file. A column called Direct Booking Removed? will record whether the removal was successful. You can also check the runtime processing log file RemoveDirectBooking_<timestamp>.log as well.

    Log file results in Excel
    Figure 3: Reviewing runtime log file in Excel (see larger screenshot)

Note The Direct Booking Removed? column now shows Yes where applicable, but the three Direct Booking settings columns still show their pre-removal values of “Yes”; this is because we record those three values before removal. If you were to run the script again in read-only mode against the same input file, those columns would reflect a value of N/A since there would no longer be any Direct Booking settings enabled. The Resource Room?, AutoAccept Enabled?, and Conflict Detected columns all have a value of N/A regardless because they are not relevant when disabling the Direct Booking settings.

Scenario 2

You're an administrator who's new to an organization. You know that they migrated from Exchange 2003 to Exchange 2007 in the distant past and are currently in the process of implementing Exchange 2013, having already migrated some users to Exchange 2013. You have no idea which resource mailboxes or even user mailboxes may be using Direct Booking and would like to discover who has which Direct Booking settings enabled. You would then like to selectively choose which mailboxes to pilot for Direct Booking removal before taking action on the majority of found mailboxes.

Here's how you would accomplish this using the Remove-DirectBooking.ps1 script:

  1. Obtain a service account that has Application Impersonation rights for all mailboxes in the org.
  2. Ensure the service account has at least the Exchange View-Only Administrator role (2007) and at least an RBAC role assignment of View-Only Recipients (2010/2013).
  3. On a machine that has the Exchange management tools & the Exchange Web Services API 1.2 or greater installed, preferably an Exchange 2013 server, open the Exchange Management Shell, navigate to the folder containing the script, and run the script using the following syntax:

    .\Remove-DirectBooking.ps1 -Identity *

  4. The script will prompt you for the domain credentials of the account you wish to use because no credentials were specified. Enter the service account’s credentials.
  5. The script will process all mailboxes in the organization with detailed logging sent to the shell on the console. Note: depending on the number of mailboxes in the org, this may take some time to complete.
  6. When the script completes, open the Remove-DirectBooking_<timestamp>.csv in Excel, which will look something like:


    Figure 4: Reviewing the Remove-DirectBooking_<timestamp>.csv in Excel (see larger screenshot)

  7. Filter or sort the table by the Direct Booking Enabled? column. This will provide a list that can be scrutinized to determine which mailboxes are to be piloted with Direct Booking removal, such as those that have conflicts with already having the Resource Booking Attendant’s Automated Processing set to AutoAccept (which you can also filter on using the AutoAccept Enabled? column).
  8. Once the list has been reviewed and the targeted mailboxes isolated, simply copy their email addresses into a text file (one address per line), save the text file, and use it as the input source for running the script to disable the Direct Booking settings (a sketch of building this input file from the CSV appears after this list):

    .\Remove-DirectBooking.ps1 -InputFile '.\' -RemoveDirectBooking

  9. As before, the script will prompt you for the domain credentials of the account you wish to use. Enter the service account’s credentials.
  10. The script will process all the mailboxes listed in the input file with detailed logging sent to the shell on the console. It will disable all enabled Direct Booking settings encountered.
  11. Use the same validation steps at the end of the previous example to verify the removal was successful.
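
For step 8, one quick way to build that input file is to filter the CSV in PowerShell rather than copying addresses by hand. This is a sketch only; the column names shown are assumptions, so confirm them against your CSV header first:

# Hypothetical column names -- adjust to match the actual CSV header.
Import-Csv '.\Remove-DirectBooking_<timestamp>.csv' |
    Where-Object { $_.'Direct Booking Enabled?' -eq 'Yes' } |
    ForEach-Object { $_.'Mailbox' } |
    Set-Content '.\PilotMailboxes.txt'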

Script Options and Caveats

Please see the script’s help section (via “get-help .\remove-DirectBooking.ps1 -full”) for full information on all the available parameters. Here are some additional options that may be useful in certain scenarios:

  1. EWSURL parameter By default, the script will attempt to retrieve the EWS URL for each mailbox via AutoDiscover. This is preferred, especially in complex multi-datacenter or hybrid Exchange Online/on-premises environments where different EWS URLs may be in play for any given mailbox depending on where it resides in the org. However, there may be times when you want to supply an EWS URL manually, such as when AutoDiscover is having “issues”, or when the response time for AutoDiscover requests is introducing delays in overall script execution (think of a very large number of mailbox identities to churn through) and the EWS URL is the same across the org. In these situations, you can use the EWSURL parameter to feed the script a static EWS URL (see the example after this list).
  2. UseDefaultCredentials If the current user is the service account, or simply has both the impersonation rights and the necessary Exchange admin rights per the script’s requirements, and doesn’t wish to be prompted to type in a credential (scheduling the script to run as a job is another great example), you can use the UseDefaultCredentials switch to run the script under that security context.
  3. RemoveDirectBooking By default, the script runs in read-only mode. In order to make changes and disable Direct Booking settings on the mailbox, you must specify the RemoveDirectBooking switch.
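
For example, supplying a static EWS URL together with the current credentials might look like the following (the URL is illustrative; confirm the exact parameter names via the script's help):

.\Remove-DirectBooking.ps1 -Identity * -UseDefaultCredentials -EWSURL "https://mail.contoso.com/EWS/Exchange.asmx"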

The script does have several prerequisites and caveats to ensure proper operation and meaningful results:

  1. Application Impersonation rights and minimum Exchange Admin rights must be used
  2. Exchange Web Services Managed API 1.2 or later must be installed on the machine running the script
  3. Exchange management tools must be installed on the machine running the script
  4. Script must be executed from within the Exchange Management Shell
  5. The Shell session must have the appropriate execution policy to allow the script to be executed (by default, you can't execute unsigned scripts; see the example after this list).
  6. AutoDiscover must be configured correctly (unless the EWS URL is entered manually)
  7. Exchange 2003-based mailboxes cannot be targeted due to lack of EWS capabilities
  8. In an Exchange 2010/2013 environment that also has Exchange 2007 mailboxes present, the script should be executed from a machine running Exchange 2010/2013 management tools due to changes in the cmdlets in those versions
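
Regarding the execution policy caveat (item 5), one common approach is to relax the policy for the current session only, for example:

# Allows locally created/unsigned scripts for this shell session only;
# adjust to your organization's script-signing policy.
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process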

Summary

The discovery and removal of Direct Booking settings can be a tedious and costly process to perform manually, but you can automate it using current functions and features via PowerShell and EWS in Microsoft Exchange Server 2007, 2010, and 2013. With careful use, the Remove-DirectBooking.ps1 script can be a valuable tool to aid Exchange administrators in maintaining automated resource scheduling capabilities in their Microsoft Exchange environments.

Your feedback and comments are welcome.

Thank you to Brian Day and Nino Bilic for their guidance in content review, and to our customers (you know who you are) for piloting the script.

Seth Brandes & Dan Smith


Released: Exchange 2013 Server Role Requirements Calculator


It’s been a long road, but the initial release of the Exchange 2013 Server Role Requirements Calculator is here. No, that isn’t a mistake, the calculator has been rebranded.  Yes, this is no longer a Mailbox server role calculator; this calculator includes recommendations on sizing Client Access servers too! Originally, marketing wanted to brand it as the Microsoft Exchange Server 2013 Client Access and Mailbox Server Roles Theoretical Capacity Planning Calculator, On-Premises Edition.  Wow, that’s a mouthful and reminds me of this branding parody.  Thankfully, I vetoed that name (you’re welcome!).

The calculator supports the architectural changes made possible with Exchange 2013:

Client Access Servers

Like with Exchange 2010, the recommendation in Exchange 2013 is to deploy multi-role servers. There are very few reasons you would need to deploy dedicated Client Access servers (CAS); CPU constraints, use of Windows Network Load Balancing in small deployments (even with our architectural changes in client connectivity, we still do not recommend Windows NLB for any large deployments) and certificate management are a few examples that may justify dedicated CAS.

When deploying multi-role servers, the calculator will take into account the impact that the CAS role has and make recommendations for sizing the entire server’s memory and CPU. So when you see the CPU utilization value, this will include the impact both roles have!

When deploying dedicated server roles, the calculator will recommend the minimum number of Client Access processor cores and memory per server, as well as the minimum number of CAS you should deploy in each datacenter.

Transport

Now that the Mailbox server role includes additional components like transport, it only makes sense to include transport sizing in the calculator. This release does just that and will factor in message queue expiration and Safety Net hold time when calculating the database size. The calculator even makes a recommendation on where to deploy the mail.que database, either the system disk, or on a dedicated disk!

Multiple Databases / JBOD Volume Support

Exchange 2010 introduced the concept of 1 database per JBOD volume when deploying multiple database copies. However, this architecture did not ensure that the drive was utilized effectively across all three dimensions – throughput, IO, and capacity. Typically, the system was balanced from an IO and capacity perspective, but throughput was where we saw an imbalance, because during reseeds only a portion of the target disk’s total capable throughput was utilized. In addition, capacity on the 7.2K disks continues to increase, with 4TB disks now available, thus impacting our ability to remain balanced along that dimension. Exchange 2013 also includes a 33% reduction in IO when compared to Exchange 2010. Naturally, the concept of 1 database / JBOD volume needed to evolve. As a result, Exchange 2013 made several architectural changes in the store process, ESE, and HA architecture to support multiple databases per JBOD volume. If you would like more information, please see Scott’s excellent TechEd session in a few weeks on Exchange 2013 High Availability and Site Resilience or the High Availability and Site Resilience topic on TechNet.

By default, the calculator will recommend multiple databases per JBOD volume. This architecture is supported for single datacenter deployments and multi-datacenter deployments when there is copy and/or server symmetry. The calculator supports highly available database copies and lagged database copies with this volume architecture type. The distribution algorithm will lay out the copies appropriately, as well as generate the deployment scripts correctly to support AutoReseed.

High Availability Architecture Improvements

The calculator has been improved in several ways for high availability architectures:

  • You can now specify the Witness Server location, either primary, secondary, or tertiary datacenter.
  • The calculator allows you to simulate WAN failures, so that you can see how the databases are distributed during the worst failure mode.
  • The calculator allows you to name servers and define a database prefix which are then used in the deployment scripts.
  • The distribution algorithm supports single datacenter HA deployments, Active/Passive deployments, and Active/Active deployments.
  • The calculator includes a PowerShell script to automate DAG creation.
  • In the event you are deploying your high availability architecture with direct attached storage, you can now specify the maximum number of database volumes each server will support. For example, if you are deploying a server architecture that can support 24 disks, you can specify a maximum of 20 database volumes (leaving 2 disks for the system, 1 disk for the Restore Volume, and 1 disk as a spare for AutoReseed).

Additional Mailbox Tiers (sort of!)

Over the years, a few, but vocal, members of the community have requested that I add more mailbox tiers to the calculator. As many of you know, I rarely recommend sizing multiple mailbox tiers, as that simply adds operational complexity and I am all about removing complexity in your messaging environments. While I haven’t specifically added additional mailbox tiers, I have added the ability for you to define a percentage of the mailbox tier population that should have the IO and Megacycle Multiplication Factors applied. In a way, this allows you to define up to eight different mailbox tiers.

Processors

I’ve received a number of questions regarding processor sizing in the calculator.  People are comparing the Exchange 2010 Mailbox Server Role Requirements Calculator output with the Exchange 2013 Server Role Requirements Calculator.  As mentioned in our Exchange 2013 Performance Sizing article, the megacycle guidance in Exchange 2013 leverages a new server baseline, therefore, you cannot directly compare the output from the Exchange 2010 calculator with the Exchange 2013 calculator.

Conclusion

There are many other minor improvements sprinkled throughout the calculator.  We hope you enjoy this initial release.  All of this work wouldn’t have occurred without the efforts of Jeff Mealiffe (for without our sizing guidance there would be no calculator!), David Mosier (VBA scripting guru and the master of crafting the distribution worksheet), and Jon Gollogy (deployment scripting master).

As always we welcome feedback and please report any issues you may encounter while using the calculator by emailing strgcalc AT microsoft DOT com.

Ross Smith IV
Principal Program Manager
Exchange Customer Experience

Released: Exchange Server 2013 Management Pack


The Microsoft Exchange Server 2013 Management Pack (SCOM MP) is now live!

As I discussed in my Managed Availability article, the key difference between this management pack and previous releases is that our health logic is now built into Exchange, as opposed to the management pack. This means updates to Exchange 2013 (like our cumulative updates) will include changes to the probes, monitors, and responders. Any issues that Managed Availability cannot solve are bubbled up to SCOM via an event monitor.

You can download the management pack via Microsoft Download Center at http://www.microsoft.com/en-us/download/details.aspx?id=39039.

You can also view the following documentation:

More information can be found at the SCOM team’s blog - http://blogs.technet.com/b/momteam/archive/2013/05/14/exchange-2013-management-pack-released.aspx.

Ross Smith IV
Principal Program Manager
Exchange Customer Experience

Using Exchange Web Services to Apply a Personal Tag to a Custom Folder


In Exchange 2010, we introduced Retention Tags, a Messaging Records Management (MRM) feature that allows you to manage email lifecycle. You can use retention policies to retain mailbox data for as long as it’s required to meet business or regulatory requirements, and delete items older than the specified period.

One of the design goals for MRM 2.0 was to simplify administration compared to Managed Folders, the MRM feature introduced in Exchange 2007, and allow users more flexibility. By applying a Personal Tag to a folder, users can have different retention settings apply to items in that folder than the default tag applied to the entire mailbox (known as a Default Policy Tag). Similarly, users can apply a different tag to a subfolder than the one applied to the parent folder. Users can also apply a Personal Tag to individual items, allowing them the freedom to organize messages based on their work habits and preference, rather than forcing them to move messages, based on the retention requirement, to an admin-controlled Managed Folder.

You can still use Managed Folders in Exchange 2010, but they’re not available in Exchange 2013.

For a comparison of Retention Tags with Managed Folders and migration details, see Migrate Managed Folders.

If you like the Managed Folders approach of being able to create a folder in the user’s mailbox and configure a retention setting for that folder, you can use Exchange Web Services (EWS) to accomplish something similar, with some caveats mentioned later in this post. You can write your own code or even a PowerShell script to create a folder in the user’s mailbox and apply a Personal Tag to it. There are scripts available on the interwebs, including some code samples on MSDN to accomplish this. For example:

Note: The above scripts are examples for your reference. They’re not written or tested by the Exchange product group.
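
For illustration only, a rough sketch of the approach (not product group code, and untested) using the EWS Managed API from PowerShell might look like the following. The folder name, SMTP address, and tag GUID are placeholders; the retention period should mirror the tag's configuration, and the flags value follows commonly published samples:

Add-Type -Path "C:\Program Files\Microsoft\Exchange\Web Services\2.0\Microsoft.Exchange.WebServices.dll"

$svc = New-Object Microsoft.Exchange.WebServices.Data.ExchangeService([Microsoft.Exchange.WebServices.Data.ExchangeVersion]::Exchange2010_SP2)
$svc.UseDefaultCredentials = $true
$svc.AutodiscoverUrl("user@contoso.com", {$true})

# MAPI properties that carry retention tag information ([MS-OXPROPS])
$PidTagPolicyTag       = New-Object Microsoft.Exchange.WebServices.Data.ExtendedPropertyDefinition(0x3019, [Microsoft.Exchange.WebServices.Data.MapiPropertyType]::Binary)
$PidTagRetentionPeriod = New-Object Microsoft.Exchange.WebServices.Data.ExtendedPropertyDefinition(0x301A, [Microsoft.Exchange.WebServices.Data.MapiPropertyType]::Integer)
$PidTagRetentionFlags  = New-Object Microsoft.Exchange.WebServices.Data.ExtendedPropertyDefinition(0x301D, [Microsoft.Exchange.WebServices.Data.MapiPropertyType]::Integer)

$folder = New-Object Microsoft.Exchange.WebServices.Data.Folder($svc)
$folder.DisplayName = "Project Correspondence"
$tagGuid = [Guid]"00000000-0000-0000-0000-000000000000"   # RetentionId of the Personal Tag (see Get-RetentionPolicyTag)
$folder.SetExtendedProperty($PidTagPolicyTag, $tagGuid.ToByteArray())
$folder.SetExtendedProperty($PidTagRetentionFlags, 137)    # value used in published samples for an explicit personal tag
$folder.SetExtendedProperty($PidTagRetentionPeriod, 365)   # days; should mirror the tag's AgeLimitForRetention
$folder.Save([Microsoft.Exchange.WebServices.Data.WellKnownFolderName]::MsgFolderRoot)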

But is it supported?

We frequently get questions about whether this is supported by Microsoft. Short answer: Yes. Exchange Web Services (EWS) is a supported and documented API, which allows ISVs and customers to create custom solutions for Exchange.

When using EWS in your code or PowerShell script to apply a Personal Tag to a folder, it’s important to consider the following:

For Developers

  • EWS is meant for developers who can write custom code or scripts to extend Exchange’s functionality. As a developer, you must have a good understanding of the functionality available via the API and what you can do with it using your code/script.
  • Support for EWS API is offered through our Exchange Developer Support channels.

For IT Pros

  • If you’re an IT Pro writing your own code or scripts, you’re a developer too! Above applies to you.
  • If you’re an IT Pro using 3rd-party code or scripts, including the code samples & scripts available on MSDN, TechNet or elsewhere on the interwebs, we recommend that you follow the general best practices for using such code or scripts, including (but not limited to) the following:
    • Do not use code/scripts from untrusted sources in a production environment.
    • Understand what the script or code does. (This is easy for scripts – you can look at the source in a text editor.)
    • Test the script or code thoroughly in a non-production environment, including all command-line options/parameters available in it, before installing or executing it in your production environment.
    • Although it’s easy to change the PowerShell execution policy on your servers to allow unsigned scripts to execute, it’s recommended to allow only signed scripts in production environments. You can easily sign a script if it's unsigned, before running it in a production environment.
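
As a hypothetical example of that last point, signing a downloaded script with a code-signing certificate already present in the current user's store can be as simple as:

$cert = Get-ChildItem Cert:\CurrentUser\My -CodeSigningCert | Select-Object -First 1
Set-AuthenticodeSignature -FilePath .\New-TaggedFolder.ps1 -Certificate $cert   # script name is a placeholder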

So should I do it?

If using EWS to apply a Personal Tag to custom folders helps you meet your business requirements, absolutely! However, do note and consider the following:

  • You’re replicating some of the functionality available via Managed Folders, but it doesn’t turn the folder into a Managed Folder.
  • Remember - it’s a Personal Tag! Users can remove the tag from the folder using Outlook or Outlook Web App.
  • If you have additional Personal Tags available in your environment, users can change the tag on the custom folder.
  • Users can tag individual items with a different Personal Tag. There is no way to enforce inheritance of the retention tag if Personal Tags have been provisioned and are available to the user.
  • Users can rename or delete custom folders. Unlike Managed Folders, which are protected from changes or deletion by users, custom folders created by users or by admin are just like any other (non-default) folder in the mailbox.

Provisioning custom folders with different retention settings (by applying Personal Tags) may help you meet your organization’s retention requirements. As an IT Pro, make sure you understand the above and follow the best practices.

Bharat Suneja

Ambiguous URLs and their effect on Exchange 2010 to Exchange 2013 Migrations


With the recent releases of Exchange Server 2013 RTM CU1, Exchange 2013 sizing guidance, the Exchange 2013 Server Role Requirements Calculator, and the updated Exchange 2013 Deployment Assistant, on-premises customers now have the tools needed to begin designing and performing migrations to Exchange Server 2013. Many of you have introduced Exchange 2013 RTM CU1 into your test environments alongside Exchange 2010 SP3 and/or Exchange 2007 SP3 RU10, and are readying yourselves for the production migrations.

There's one particular Exchange 2010 design choice some customers made that could throw a monkey wrench into your upgrade plans to Exchange 2013, and we want to walk you through how to mitigate it so you can move forward. If you're still in the design or deployment phase of Exchange Server 2010, we recommend you continue reading this article so you can make some intelligent design choices which will benefit you when you migrate to Exchange 2013 or later.

What is the situation we need to look for?

In Exchange 2010, all Outlook clients in the most typical configurations will utilize MAPI/RPC or Outlook Anywhere (RPC over HTTPS) connections to a Client Access Server. The MAPI/RPC clients connect to the CAS Array Object FQDN (also known as the RPC endpoint) for Mailbox access and the HTTPS based clients connect to the Outlook Anywhere hostname (also known as the RPC proxy endpoint) for all Mailbox and Public Folder access. In addition to these primary connections, other HTTPS based workloads such as EAS, ECP, OAB, and EWS may be sharing the same FQDN as Outlook Anywhere. In some environments you may also be sharing the same FQDN with POP/IMAP based clients and using it as an SMTP endpoint for internal mail submissions.

In Exchange 2010, the recommendation was to utilize split DNS and ensure that the CAS Array Object FQDN was only resolvable via DNS by internal clients. External clients should never be able to resolve the CAS Array Object FQDN. This was covered previously in item #4 of Demystifying the CAS Array Object - Part 2. If you put those two design rules together, you come to the conclusion that your ClientAccessArray FQDN, used by the mailbox database RpcClientAccessServer property, should have been an internal-only unique FQDN not utilized by any workload besides MAPI/RPC clients.

Take the following chart as an example of what a suggested configuration in a split DNS configuration would have looked like.

FQDN | Used By | Internal DNS resolves to | External DNS resolves to
mail.contoso.com | All HTTPS Workloads | Internal Load Balancer IP | Perimeter Network Device
outlook.contoso.com | MAPI/RPC Workloads | Internal Load Balancer IP | N/A

If you do not utilize split DNS, then a suggested configuration may have been:

FQDN | Used By | DNS resolves to
mail.contoso.com | External HTTPS Workloads | Perimeter Network Device
mail-int.contoso.com | Internal HTTPS Workloads | Internal Load Balancer IP
outlook.contoso.com | Internal MAPI/RPC Workloads | Internal Load Balancer IP

In speaking with our Premier Field Engineers and MCS consultants, we learned that some of our customers did not choose to use a unique ClientAccessArray FQDN. This design choice may manifest itself in one of two ways. The MAPI/RPC and HTTPS workloads may both utilize the mail.contoso.com FQDN internally and externally, or a unique external FQDN of mail.contoso.com is used while internal MAPI/RPC and HTTPS workloads share mail-int.contoso.com. The shared FQDN in either situation is ambiguous because we can't look at it and immediately understand the workload type that's using it. Perhaps we were not clear enough in our original guidance, or customers felt fewer names would help reduce overall design complexity since everything appeared to work with this configuration.

Take a look at the figure below and the FQDNs in use for some of the different workloads. Shown are EWS, ECP, OWA, CAS Array Object, and Outlook Anywhere External Hostname. The yellow arrow specifically points out the CAS Array Object, the value used as the RpcClientAccessServer for Exchange 2010 mailbox databases, and seen in the Server field of an Outlook profile for an Exchange 2010 mailbox.

An Exchange 2010 deployment with a single ambiguous URL for all workloads.

Let us pause for a moment to visualize what we have talked about so far. If we were to compare an Exchange 2010 environment using ambiguous URLs to one not using ambiguous URLs, it would look like the following diagrams. Notice the first diagram below uses the same FQDN for Outlook MAPI/RPC based traffic and HTTPS based traffic.

image

If we were to then look at an environment not utilizing ambiguous URLs, we see the clients utilize unique FQDNs for MAPI/RPC based traffic and HTTPS based traffic. In addition, the FQDN utilized for MAPI/RPC based traffic is only resolvable via internal DNS.

image

If your environment does not look like the one above using ambiguous URLs, then you can go hit the coffee shop for a while or play some XBOX 360. Tell your boss we gave the okay. If your environment does look similar to the first example using ambiguous URLs or you are in the planning stages for Exchange 2010, then please read on as we need you to perform some extra steps when migrating to Exchange 2013.

So what’s the big deal? It is functional this way isn’t it?

While this may be working for you today, it certainly will not work tomorrow if you migrate to Exchange 2013. In this scenario, where both the MAPI/RPC and HTTPS workloads are using the same FQDN, you cannot successfully move the FQDN to CAS 2013 without breaking your MAPI/RPC client connectivity entirely. I repeat, your MAPI/RPC clients will start failing to connect via MAPI/RPC once their DNS cache expires after the shared FQDN is moved to CAS 2013. The MAPI/RPC clients will fail to connect because CAS 2013 does not know how to handle direct MAPI/RPC connections; all Windows-based Outlook clients utilize RPC over HTTPS connections in Exchange 2013. There is a chance your Outlook clients may successfully fall back to HTTPS, but only if Outlook Anywhere is currently enabled for Exchange 2010 when the failure to connect via MAPI/RPC takes place. This article will help with the following:

  1. Ensure you are in full control of what will take place
  2. Ensure you are in full control of when #1 takes place
  3. Ensure you are in a supported server + client configuration
  4. Ensure environments with Outlook Anywhere disabled for Exchange 2010 know their path forward
  5. Help remove the possibility of any clients not automatically falling back to HTTPS
  6. Remove the potentially long delay when Outlook fails to connect via MAPI/RPC (even though it can resolve the MAPI/RPC URL) and then falls back to HTTPS

Shoot… this looks like us. What should we do immediately?

First off, if you are still in the planning stages of Exchange 2010, you need to take our warning to heart and immediately change your design to use a specific internal-only FQDN for MAPI/RPC clients. If you are in the middle of an Exchange 2010 deployment using an ambiguous URL, I recommend you change your ClientAccessArray FQDN to a unique name and update the mailbox database RpcClientAccessServer values on all Exchange 2010 mailbox databases accordingly. Fixing this item mid-migration to Exchange 2010, or even in your fully migrated environment, will ensure any newly created or manually repaired Outlook profiles are protected, but it will not automatically fix existing Outlook clients with the old value in the Server field.
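
As a hedged example (the array and FQDN names are placeholders, and you would first create the matching internal DNS record), the change itself is a couple of cmdlets in the Exchange 2010 Management Shell:

# Give the existing CAS array an internal-only FQDN, then repoint the
# Exchange 2010 mailbox databases that still reference the ambiguous name.
Set-ClientAccessArray -Identity "contoso-array" -Fqdn "outlook.contoso.com"
Get-MailboxDatabase | Where-Object { $_.RpcClientAccessServer -eq "mail.contoso.com" } |
    Set-MailboxDatabase -RpcClientAccessServer "outlook.contoso.com"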

While not necessary as long as you go through our mitigation steps below, any existing Outlook profiles could be manually repaired to reflect the new value. If you are curious why a manual repair is necessary you can refer to items #5 and #6 in Demystifying the CAS Array Object - Part 2. Again, forcing this update is not necessary if you follow our mitigation steps later in this article. However, if you were to choose to update some specific Outlook profiles we suggest you perform those steps in your test environment first to make sure you have the process down correctly.

Additionally, as we previously discussed in item #3 of Demystifying the CAS Array Object – Part 1, the ClientAccessArray FQDN is not needed in your SSL certificate as it is not being used for HTTPS based traffic. Because of this, the only things you would need to do are create a new internal DNS record, update your ClientAccessArray FQDN, and finally update your Exchange 2010 mailbox database RpcClientAccessServer values. It bears repeating that you do not have to get a new SSL certificate just to fix an ambiguous URL situation.

Ok, fixed that… now what about the clients we don’t want to repair manually?

Our suggestion is to implement Outlook Anywhere internally for all users prior to introducing Exchange Server 2013 to the environment.

Many of our customers have already moved to Outlook Anywhere internally for all Windows Outlook clients. In fact, those of you reading this with OA in use internally are good to proceed to the coffee shop or go play XBOX 360 with the other folks if you’d like to.

Now for the rest of you… sit a little closer. Go ahead and fill in, there are plenty of seats in the front row like usual.

In Exchange Server 2013, all Windows Outlook clients operate in Outlook Anywhere mode internally. By following these mitigation steps you will be one step ahead of where you will end up after your migration to Exchange Server 2013 anyway.

If you do not have Outlook Anywhere enabled at all in your environment, please see Enable Outlook Anywhere on TechNet for steps on how to enable it in Exchange 2010. If your company does not wish to provide external access for Outlook Anywhere that is ok. By simply enabling Outlook Anywhere you will not be providing remote access unless you also publish the /rpc virtual directory to the Internet.
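
If you only need to turn the feature on, the Exchange 2010 cmdlet looks roughly like this; the server name and authentication choice below are examples, so refer to the TechNet topic above for the options appropriate to your environment:

Enable-OutlookAnywhere -Server CAS01 -ExternalHostname "mail.contoso.com" -DefaultAuthenticationMethod NTLM -SSLOffloading $false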

It is suggested that customers, especially very large ones, consider enabling Kerberos authentication to avoid any potential performance issues you may run into utilizing the default NTLM authentication. Information on how to configure Kerberos authentication can be found here on TechNet for Exchange Server 2010; the steps for Exchange Server 2013 are similar, and we will have documentation for them in the near future. However, please keep in mind Kerberos authentication with Outlook Anywhere is only supported with Windows Vista or later.

By default with Outlook Anywhere enabled in the environment your clients prefer RPC/TCP connections when on Fast Networks as seen below.

image

The trick we use to force Outlook Anywhere to also be used internally is via Autodiscover. Using Autodiscover we can make Windows Outlook clients prefer RPC/HTTPS on both Fast and Slow networks as seen here.

image

The method used to make clients always prefer HTTPS is configuring the OutlookProviderFlags option via the Set-OutlookProvider cmdlet. The following commands are executed from the Exchange 2010 Management Shell.

Set-OutlookProvider EXPR -OutlookProviderFlags:ServerExclusiveConnect

Set-OutlookProvider EXCH -OutlookProviderFlags:ServerExclusiveConnect

If for any reason you need to put the configuration back to its default settings, issue the following commands and clients will no longer prefer HTTP on Fast Networks.

Set-OutlookProvider EXPR -OutlookProviderFlags:None

Set-OutlookProvider EXCH -OutlookProviderFlags:None
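
Either way, you can verify the current value with Get-OutlookProvider (a quick check, not a required step):

Get-OutlookProvider | Format-Table Name, OutlookProviderFlags -AutoSize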

You can prepare to introduce Exchange Server 2013 to your environment once all of your Windows Outlook clients are preferring HTTP on both fast and slow networks and are connecting through mail.contoso.com for RPC over HTTPS connections.

There are a small number of things we would like to call out as you plan this migration to enable Outlook Anywhere for all internal clients.

First, your front end infrastructure (CAS 2013, load balancer, etc.) must be ready to immediately handle the full production load of Windows Outlook clients when you re-point the mail.contoso.com FQDN in DNS.

Second, if your Exchange 2010 Client Access Servers were not scaled for 100% Outlook Anywhere connections then performance should be monitored when OA is enabled and all clients are moved from MAPI/RPC based to HTTPS based workloads. You should be ready to scale out your CAS 2010 infrastructure if necessary to mitigate any possible performance issues.

Lastly, Windows Outlook clients older than Outlook 2007 are not supported going through CAS 2013, even if their mailbox is on an older Exchange version. All Windows Outlook clients going through CAS 2013 have to be at least the minimum versions supported by Exchange 2013. Any unsupported clients, such as Outlook 2003, do not support Autodiscover and would have to be manually configured with a new MAPI/RPC-specific endpoint to assure they continue communicating with Exchange 2010 until the client can be updated and the mailbox migrated to Exchange 2013.

Note: The easiest way to confirm what major/minor version of Outlook you have is to look at the version of OUTLOOK.EXE and EMSMDB32.DLL via Windows Explorer or to run an inventory report through Microsoft System Center Configuration Manager or similar software. The minimum version numbers Exchange Server 2013 supports for on-premises deployments are provided below.

  • Outlook 2007: 12.0.6665.5000 (SP3 + the November 2012 Public Update or any later PU)
  • Outlook 2010: 14.0.6126.5000 (SP1 + the November 2012 Public Update or any later PU)
  • Outlook 2013: 15.0.4420.1017 (RTM or later)

If we were to visualize the mitigation steps from start to end we need to compare it between phases.

First, the upper area of the below diagram depicts the start state of the environment with internal Windows Outlook clients utilizing MAPI/RPC and ambiguous URLs for their HTTPS based workloads. The lower area of the diagram depicts the same environment, but we have now forced Outlook Anywhere to be used by internal Windows Outlook clients. This change has forced all mailbox and public folder access traffic over HTTPS through the mail.contoso.com Outlook Anywhere FQDN.

image

We now have all Windows Outlook clients utilizing Outlook Anywhere internally by leveraging Autodiscover to force the preference of HTTPS. Now that all Windows Outlook traffic is routed through mail.contoso.com via HTTPS, the ambiguous URL problem has been mitigated. However, you may have other applications integrating with Exchange that are unable to utilize Outlook Anywhere and/or Autodiscover. These applications will also be affected if you were to update the mail.contoso.com DNS entry to point at Exchange 2013. Before moving on to the second step, it may be most efficient to add a HOSTS file entry on the servers hosting these external applications to force resolution of mail.contoso.com to the Layer-7 load balancer used by Exchange 2010. This allows you to temporarily continue routing traffic from external applications that need to talk to Exchange 2010 via MAPI/RPC while you work on updating the applications to be Outlook Anywhere compatible, which they will need to be before they can ever connect to Exchange 2013.
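
For example, the temporary pinning on such an application server might look like this from an elevated prompt; the IP shown is a placeholder for your Exchange 2010 Layer-7 load balancer VIP:

Add-Content -Path "$env:windir\System32\drivers\etc\hosts" -Value "10.0.1.25`tmail.contoso.com"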

Having dealt with both the Windows Outlook clients and the third-party applications that cannot utilize Outlook Anywhere, we can now move on to the second step. The second step is executed when you are ready to introduce Exchange 2013 to the environment.

The below diagram starts by showing where we finished after executing step one. The lower area of the diagram shows that we have updated DNS to point the mail.contoso.com entry to the IP of the new Exchange 2013 load balancer configuration. Because of the HOSTS entry we made, our application server continues talking to the old Layer-7 load balancer for its MAPI over RPC/TCP connections. Exchange 2013 CAS will now receive all client traffic and proxy traffic for users still on Exchange 2010 back to the Exchange 2010 CAS infrastructure. The redundant CAS was removed from the diagram to simplify the view and simply show traffic flow.

image

In summary, we hope those of you in this unique configuration will be able to smoothly migrate from Exchange 2010 to Exchange 2013 now that you have these mitigation steps. Some of you may identify other potential methods to use and wonder why we are offering only a single mitigation approach. Many methods were investigated, but this mitigation approach came back every time as the most straightforward to implement, maintain, and support. Given the potential complexity of this change, we invite you to ask follow-up questions in the Exchange Server forum below, where we can often interact with you better than the comments format allows.

Exchange Server Forum: Exchange Server 2013 – Setup, Deployment, Updates, and Migration

Brian Day
Senior Program Manager
Exchange Customer Experience

Released: Calculator Updates Galore!


Today, we have released an updated version of the Exchange 2013 Server Role Requirements Calculator that addresses several issues found since its initial release.  You can view what changes have been made, or download the update directly.

In addition, we are releasing an updated version of the Exchange 2010 Server Role Requirements Calculator as well. You can view what changes have been made, or download the update directly.

Ross Smith IV
Principal Program Manager
Exchange Customer Experience

Comparing public folder item counts


A question that is often asked of Support in regard to legacy Public Folders is whether they're replicating and how much progress they're making.  The most common scenario arises when the administrator is adding a new Public Folder database to the organization and replicating a large amount of data to it.  What commonly happens is that the administrator calls Support and says:

The database on the old server is 300GB, but the new database is only 150GB!  How can I tell what still needs to be replicated?  Is it still progressing??

You can raise diagnostic logging for public folders, but reading the events to see which folders are replicating is tedious.  Most administrators want a more detailed way of estimating the progress of replication than comparing file sizes.  They also want to avoid checking all the individual replication events.

There are a number of ways to monitor replication progress so that one can make an educated guess as to how long a particular environment will take to complete an operation.  In this post, I'm going to provide a detailed example of one approach to estimating the progress of replication by comparing item counts between different public folder stores.

Getting Public Folder item counts

To get the item counts in an Exchange 2003 Public folder database you can use PFDAVAdmin.  The process is outlined in this previous EHLO blog post.  For what we're doing below, you'll need the DisplayName, Folderpath and the total number of items in the folder. The rest of the fields aren't necessary.

To get the item counts on an Exchange 2007 server, use (remember there is only one Pub per server):

Get-PublicFolderStatistics -Server <servername> | Export-Csv c:\file1.txt

To get the item counts on an Exchange 2010 server, you use:

Get-PublicFolderStatistics -Server <servername> -ResultSize unlimited | Export-Csv c:\file1.txt
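
If you prefer to trim the export down to just the fields used in the comparison (and avoid the type-information cleanup described below), a variation like the following also works:

Get-PublicFolderStatistics -Server <servername> -ResultSize unlimited |
    Select-Object Name, FolderPath, ItemCount |
    Export-Csv c:\file1.txt -NoTypeInformation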

Comparing item counts

There are some very important caveats to this whole procedure.  The things you need to watch out for are:

  • We're only checking item counts. If you delete 10 items and add 10 items between executions of the statistics gathering, this type of query will not reveal whether they have replicated. Therefore, having the same number on both sides is not necessarily an assurance that the folders are in sync.
  • If you're comparing folders that contain recurring meetings, the item counts can be different on Exchange 2007 and older because of the way WebDAV interacts with those items.
  • I've seen many administrators try to compare the size of one Public Folder database to the size of another. Such an approach to checking on replication does not take into account space for deleted items, overhead and unused space. Checking item counts is more reliable than simply comparing database sizes.
  • The two databases might be at very different stages of processing replication messages.  It is unlikely that both pubs will present the same numbers of items if the folders are continuously active.  Even if the folders are seeing relatively low activity levels, it's not uncommon for the item count to be off by one or two items because the replication cycle (which defaults to every 15 minutes) simply hasn’t gotten to the latest post
  • If you really want to know if two replicas are in sync, try to remove one.  If Exchange lets you remove the instance, you know Exchange believes the folders are in sync.  If Exchange cannot confirm the folders are in sync, it'll keep the instance until it can complete the backfill from it.  In most cases, the administrators I have spoken with are not in a position where they can use this approach.

For the actual comparison you can use any number of products.  For this blog I have chosen Microsoft Access for demonstrating the process of comparing the CSV files from the different servers.  To keep things simple I am going to use the Access database.  There are some limitations to my approach:

  • Access databases have a maximum file size of 2GB. If your public folder infrastructure is particularly large (i.e.  your CSV files are over 500MB) you may have to switch to using Microsoft SQL.
  • I am not going to compare public folders with a Folder path greater than 254 characters because the Jet database engine that ships with Access cannot join memo fields in a query.  Working around the join limitation by splitting the path across multiple text fields is beyond the scope of this blog.
  • I am going to look at folders that exist in both CSV files.   If the instance has not been created and its data exported into the CSV file the folder will not be listed.

An outline of the process is:

  1. Export the item counts from the two servers you wish to compare
  2. Import the resulting text files
  3. Clean up the data for the final query
  4. Run a query to list the item counts for all folders that are in Both files and the difference in the item counts between the originally imported files

Assumptions for the steps below:

  • You have exported the public folder statistics with the PowerShell commands presented above
  • You have fields named FolderPath, ItemCount and Name in the CSV file

If your file is different than expected you will have to modify the steps as you go along

Here are the steps for conducting the comparison:

1. Create a new blank Microsoft Access database in a location that has more than double the size of your CSV files available as free space.

2. By default, the Export-Csv cmdlet includes the .NET type information in the first line of the CSV output. Because this line will interfere with the import, we'll need to remove it.  Open each CSV file in notepad (this can take a while for larger files) and remove the line highlighted below.  In this example the line starting with “AdminDisplayName” would become the topmost line of the file.  Once the top line is deleted close and save the file.

Figure 1

TIP You can avoid this step by including the -NoTypeInformation switch when using the Export-CSV cmdlet, which filters out the .NET object type information from the CSV output. For details, see Using the Export-Csv cmdlet on TechNet. (Thanks to #MSExchange MVP @SteveGoodman for the tip!)

3. Import the CSV file to a new table:

  • Click on the External Data tab as highlighted in Figure 2
  • Browse to the CSV file and select it (or type in its path and name directly)
  • Make sure the “Import the source data into a new table in the current database” option is selected
  • Click OK

Figure 2

4. In the wizard that starts specify the file is delimited as shown and then click Next.

Figure 3

5. Tell the wizard that the text qualifier is the double quote (character 34 in ASCII), the delimiter is the comma and that the “First Row Contains Field Names” as shown in Figure 4.

Note:  It is possible that you will receive a warning when you click “First Row Contains Field Names”.  If any of the field names violate the rules for a field name Access will display a warning.  Don’t panic.  Access will replace the non-conforming names with ones it considers appropriate (typically Field1, Field2, etc.).  You can change the names if you wish on the Advanced screen.

Figure 4

6. Switch to Advanced view (click the Advanced button highlighted in Figure 4) so that we can change the data type of the FolderPath field. In Access 2010 and older the data type needs to be changed from Text to Memo. In Access 2013 it needs to be changed from Short Text to Long Text. While we are in this window you have the option to exclude columns that are not needed by placing a checkmark in the Skip box for each unneeded column. In this blog we are only going to use the FolderPath, Name and ItemCount fields. You can also exclude fields earlier in the process by specifying which fields will be exported when you run Export-Csv. The following screenshots show the Advanced properties window.

Figure 5a: Access 2010 and older

Figure 5b: Access 2013

Note:  If you think you will be doing this frequently you can use the Save As button to save your settings.  The settings will be saved inside the Access database and can then be selected during future imports by clicking on the Specs button.

7. Click OK on the Advanced dialog and then click Finish in the wizard.

8. When prompted to save the Import steps click Close.  If you think you will be repeating this process in the future feel free to explore saving the import steps.

9. Access will import the data into a table. By default the table will have the same name as the source CSV file. The files used in creating this blog were called 2007PF_120301 and 2010PF_120301. If there are any import errors they will be saved in a separate table. Take a moment to examine what they are. The most common is that a field got truncated. If that field is the FolderPath it will affect the comparisons later. If there are other problems you will have to troubleshoot what is wrong with the highlighted lines (typically there should be no import errors as long as the FolderPath is set as a Memo field).

10. Go back to Step 2 to import the second file that will be used in the comparison. 

11. Now a query must be run to determine if any folderpath exceeds 255 characters.  Fields longer than 255 characters cannot be used for a join in an Access query.  If we have values that exceed 255 characters in this field we will need to exclude them from the comparison.  Additional work to split a long path across multiple fields can be done, but that is being left as an exercise for any Access savvy readers. 

12. To get started select the options highlighted in Yellow in Figure 6:

Figure 6

13. Highlight the table where we want to check the length of the folderpath field as shown in Figure 7.  Once you have selected the table click Add and then Close:

Figure 7

14. Switch to SQL view as shown in Figure 8:

Figure 8

15. Replace the default select statement with one that looks like this (please make sure you substitute your own table name for the one that I have Bolded in the example):

SELECT Len([FolderPath]) AS Expr1, [2007PF_120301].FolderPath
FROM 2007PF_120301
WHERE (((Len([FolderPath]))>254));

Note:  Be sure the semi-colon is the last character in the statement.

16. Run the query using the red “!” as shown in Figure 9: 

image
Figure 9

image
Figure 10

17. If the result is a single empty row (as shown in Figure 10) then skip down to step 19.  If the result is at least one row then go back to SQL view (as shown in Figure 8) and change the statement to look like this one (as before please make sure 2007PF_120301 is replaced with the table name actually being used in your database):

SELECT [2007PF_120301].FolderPath, [2007PF_120301].ItemCount,
[2007PF_120301].Name, [2007PF_120301].Identity INTO 2007PF_120301_trimmed
FROM 2007PF_120301
WHERE (((Len([FolderPath]))<255));

18. You will get a prompt like the one in Figure 11 when you run the query.  Select Yes:

image
Figure 11

19. After it is done, repeat steps 11-18 for the other CSV file that was imported for the comparison.  Once you have completed steps 11-18 for both files, advance to step 20.

20. Originally the FolderPath was imported as a Memo field (Long Text if using Access 2013).  However, we cannot join Memo fields in a query, so we need to convert them to a Text field with a length of 255.

If you got a result greater than zero rows in step 16 this step and the subsequent steps will all be carried out on the table specified in the INTO clause of the SQL statement (in this blog that table is named 2007PF_120301_trimmed). 

If you were able to skip steps 17 and 18 this step and the subsequent steps will be carried out on the table you imported (2007PF_120301 in this example).

Open the table in Design view by right-clicking on it and selecting Design View as shown in Figure 12.  If you select the wrong tables for the subsequent steps you will get a lot of unwanted duplicates in your final comparison output.

image
Figure 12

21. Change the folderpath from Memo to Text as shown in Figure 13.  If you are using Access 2013 change it from Long Text to Short Text.

image
Figure 13

22. With the FolderPath field highlighted look to the lower part of the Design window where the properties of the currently selected field are displayed.  Change the field size of folderpath to 255 characters as shown in Figure 14.

image
Figure 14

23. Save the table and close its design view.  You will be prompted as shown in Figure 15.  Don’t panic.  All the folderpaths should be shorter than the 255 characters specified in the properties of the table.  The dialog is just a standard warning from Access.  No data should be truncated (the earlier queries should have seen to that).  Say Yes and repeat steps 20-23 for the other table being used in this comparison.  If you make a mistake here remember that you will still have your original CSV files and can always fix the mistake by removing the tables and redoing the import.

image
Figure 15

24. We have been on a bit of a journey to make sure we prepared the tables.  Now for the comparison.  Create a new query (as shown in Figure 6) and highlight both tables that have had the FolderPath shortened to 255 characters, as shown in Figure 16.  Once they are highlighted, click Add and then Close.

image
Figure 16

25. Drag FolderPath from the table that is the source of your replication to FolderPath on the other table.  The result will look like Figure 17.

image
Figure 17

26.   In the top half of the Query Design window we have the tables with their fields listed.  In the bottom half we have the query grid.  You can make fields appear in the grid in any of the following ways:

  • Switch to SQL view and add them to the SELECT statement
  • Double-click the field in the top half of the window
  • Drag the field from the top half of the window to the grid
  • Click in the Field line of the grid and use the drop-down that appears to select the field
  • Type the field name you want into the Field line of the grid

For this step we need to add:

  • One copy of the folderpath field from one table (doesn’t matter which one)
  • The ItemCount field from each table

27.   Go to an empty column in the grid.  We need to enter an expression that will tell us the difference between the two item counts.  Type the following text into the column (be sure to use the table names from your own database and not my example):

Expr1:  Abs([2007PF_120301_trimmed].[itemcount]-[2010pf_120301_trimmed].[itemcount])

Note:  After steps 25-27 the final result should look like  Figure 18.  The equivalent SQL looks like this:

SELECT [2007PF_120301_trimmed].FolderPath, [2007PF_120301_trimmed].ItemCount, [2010PF_120301_trimmed].ItemCount, Abs([2007PF_120301_TRIMMED].[ItemCount]-[2010PF_120301_TRIMMED].[ItemCount]) AS Expr1
FROM 2007PF_120301_trimmed INNER JOIN 2010PF_120301_trimmed ON [2007PF_120301_trimmed].FolderPath = [2010PF_120301_trimmed].FolderPath;

image
Figure 18

28. Run the query using the red “!” shown in Figure 9.  The results will show you all the folders that exist in BOTH public folder databases, the item count in each database, and the difference between them.  I like the difference reported as a positive number, but you might prefer to remove the absolute value function.

There is more that can be done with this.  You can use Access to run a Find Unmatched query to find all items from one table that are not in the other table (thus locating folders that have an instance in one database, but not the other).  You can experiment with different Join types in the query, and you can deal with FolderPaths longer than a single Text field can accommodate.  These, and any other additional functionality you desire, are left as exercises for the reader.  I hope this provides you with a process that can be used to compare the item counts between two Public Folder stores (just remember the caveats at the top of the article).

Thanks to Bill Long for reviewing my caveats and to Oscar Goco for reviewing my steps with Access.

Chris Pollitt

Released: Update Rollup 1 for Exchange Server 2010 SP3

Today the Exchange CXP team released Update Rollup 1 for Exchange Server 2010 SP3 to the Download Center.

Note: Some of the following KB articles may not be available at the time of publishing this post.

This update contains fixes for a number of customer-reported and internally found issues. For more details, including a list of fixes included in this update, see KB 2803727. We would like to specifically call out the following fixes which are included in this release:

  • 2561346 Mailbox storage limit error when a delegate uses the manager's mailbox to send an email message in an Exchange Server 2010 environment
  • 2756460 You cannot open a mailbox that is located in a different site by using Outlook Anywhere in an Exchange Server 2010 environment
  • 2802569 Mailbox synchronization fails on an Exchange ActiveSync device in an Exchange Server 2010 environment
  • 2814847 Rapid growth in transaction logs, CPU use, and memory consumption in Exchange Server 2010 when a user syncs a mailbox by using an iOS 6.1 or 6.1.1-based device
  • 2822208 Unable to soft delete some messages after installing Exchange 2010 SP2 RU6 or SP3

For DST changes, see Daylight Saving Time Help and Support Center (microsoft.com/time).

A known issue with Exchange 2010 SP3 RU1 Setup

You cannot install or uninstall Update Rollup 1 for Exchange Server 2010 SP3 on the double-byte character set (DBCS) version of Windows Server 2012 if the language preference for non-Unicode programs is set to the default language. To work around this issue, you must first change this setting. To do this, follow these steps:

  1. In Control Panel, open the Clock, Region and Language item, and then click Region.
  2. Click the Administrative tab.
  3. In the Language for non-Unicode programs area, click Change system locale.
  4. On the Current system locale list, click English (United States), and then click OK.
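
If you prefer to script this change on Windows Server 2012, the International module cmdlets can make the same change; this is only a sketch, and a restart is required before the new system locale takes effect:

# Switch the system locale for non-Unicode programs to English (United States)
Set-WinSystemLocale -SystemLocale en-US
# A restart is required before the change takes effect; revert the setting the same way
# after Update Rollup 1 has been installed or uninstalled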

After you successfully install or uninstall Update Rollup 1, revert this language setting, as appropriate.

We have identified the cause of this problem and plan to resolve it in a future rollup, but did not want to further delay the release of RU1 for customers who are not impacted by it.

Exchange Team


Exchange at TechEd North America 2013

TechEd North America 2013 happens next week in New Orleans, Louisiana. This year, there are several Exchange and Office 365 break-out sessions and hands-on labs for IT pros and developers, including sessions on Exchange 2013 high availability, virtualization, hybrid deployments, managed availability, retention, archiving & eDiscovery, DLP, site mailboxes, modern public folders, transport, unified messaging, Outlook Web App, EWS, and more!

Monday, June 3, 2013
1:15 PM-2:30 PM
  • OUC-B206 - A Look Inside Microsoft Office 365 (Alistair Speirs)
  • OUC-B215 - Understanding Compliance in Microsoft Exchange, SharePoint, and Office (Bharat Suneja)
3:00 PM-4:15 PM
  • SES-B205 - Overview of eDiscovery across the Microsoft Office Platform (Georgiana Badea)
  • OUC-B202 - Choosing the Right Cloud Service (Alexander Bradley & Danny Burlage)
  • OUC-B315 - Microsoft Exchange Server 2013 Managed Availability (Ross Smith IV)
  • OUC-B313 - Microsoft Exchange Server 2013 Client Access Server Role (Greg Taylor)
4:45 PM-6:00 PM
  • OUC-B334 - Migration and Coexistence with Microsoft Lync Server 2013 (Justin Morris)
Tuesday, June 4, 2013
8:30 AM-9:45 AM
  • OUC-B211 - Overview of Microsoft Office 365 Identity Management (Paul Andrew)
  • OUC-B314 - Microsoft Exchange Server 2013 High Availability and Site Resilience (Scott Schnoll)
  • OUC-B327 - Microsoft Lync Hybrid Scenarios (Abi Maggu)
10:15 AM-11:30 AM
  • OUC-B203 - Collaborating with the New Microsoft Office Web Apps (Amanda Lefebvre, Nick Simons & Dan Zarzar)
  • OUC-B305 - Enterprise Network Requirements for Microsoft Lync Server 2013 (Bryan Nyce)
  • OUC-B317 - Microsoft Exchange Server 2013 Sizing (Jeff Mealiffe)
1:30 PM-2:45 PM
  • OUC-B201 - Become a Microsoft Office Ninja in 60 Minutes (Tal Kryzpow)
  • OUC-B319 - Microsoft Exchange Server 2013 Transport Architecture (Ross Smith IV)
  • OUC-B333 - Lap Around the Microsoft Lync 2013 Developer Platform (Girija Bhagavatula & Albert Kooiman)
3:15 PM-4:30 PM
  • OUC-B208 - Deploying Microsoft Office? Begin Here! (Jill Maguire & Curtis Sawin)
  • OUC-B304 - Developing Mobile Apps with Microsoft Exchange Web Services (Paul Robichaux)
  • OUC-B324 - Planning and Deploying Your Enterprise Voice (Geoff Clark)
  • OUC-B326 - Virtualization in Microsoft Exchange Server 2013 (Jeff Mealiffe)
5:00 PM-6:15 PM
  • OUC-B217 - Microsoft Office 365 Pro Plus Adoption and Change Management (Brent Whichel)
  • OUC-B307 - Get Moving with Your Mailbox! (Jaap Wesselius)
  • OUC-B332 - Planning and Deploying Conferencing in Microsoft Lync Server 2013 (Scott Johnson & Andrew Sniderman)
  • OUC-B316 - Microsoft Exchange Server 2013 On-Premises Upgrade and Coexistence (Robert Gillies)
Wednesday June 5, 2013
8:30 AM-9:45 AM
  • OUC-B209 - Microsoft Office 365 for Education: Overview and Upgrades (Jim Lucey)
  • OUC-B311 - Microsoft Exchange Hybrid Deployment and Migration On Your Terms (Neil Axelrod)
  • OUC-B405 - Deep Dive into New Unified Communications Web API of Lync 2013 (Girija Bhagavatula & Albert Kooiman)
10:15 AM-11:30 AM
  • OUC-B328 - Planning and Deployment for Edge Server with Microsoft Lync Server 2013 (Bryan Nyce)
  • OUC-B205 - Security in Microsoft Office 365 (Paul Andrew & Andy O'Donald)
  • OUC-B310 - Microsoft Exchange Archiving Policy: Move, Delete, or Hold (Dheepak Ramaswamy)
  • OUC-B210 - Team Collaboration with Site Mailboxes (Alfons Staerk)
1:30 PM-2:45 PM
  • OUC-B214 - The Deep Dark Secrets of Unified Messaging (J. Peter Bruzzese)
  • OUC-B331 - Voice Interoperability Fundamentals (Francois Doremieux & Scott Johnson)
  • OUC-B218 - Understanding Immersive Productivity and Collaboration Experiences with Perceptive Pixel Devices (Tim Bakke)
3:15 PM-4:30 PM
  • OUC-B322 - Using Windows PowerShell Magic to Manage Microsoft Office 365 (Danny Burlage)
  • OUC-B330 - Mobile Devices Deep Dive with Microsoft Lync Server 2013 (Geoff Clark)
  • OUC-B318 - Microsoft Exchange Server 2013 Tips & Tricks (Scott Schnoll)
5:00 PM-6:15 PM
  • OUC-B212 - Help Small Businesses Seize the Day with Microsoft Office 365 (Andy O'Donald)
  • OUC-B301 - Data Loss Prevention in Microsoft Exchange and Microsoft Outlook 2013 (Jack Kabat)
  • OUC-B320 - Microsoft System Center Advisor and System Center 2012 - Operations Manager: Better Together (Nick Rosenfeld)
Thursday June 6, 2013
8:30 AM-9:45 AM
  • OUC-B02 - Deploying and Updating Microsoft Office 365 ProPlus with Click-to-Run (Daniel H. Brown & Jeremy Chapman)
  • OUC-B312 - Microsoft Exchange in the Cloud: Scared of Losing Your Job? (Jaap Wesselius)
  • OUC-B401 - Microsoft Lync Server 2013 Dial Plan and Voice Routing Deep Dive (Geoff Clark & Bryan Nyce)
10:15 AM-11:30 AM
  • OUC-B335 - Scripting and Automation for Microsoft Lync (Kevin Peters)
  • OUC-B308 - Internals of Deploying the In-Place Archive: Online, On-Premises, or Hybrid (Dheepak Ramaswamy)
  • OUC-B207 - The New Outlook Web App: Designed for Touch and Offline Too! (Kip Fern & Paul Limont)
1:00 PM-2:15 PM
  • OUC-B216 - Microsoft Office 365 Service Communications (Katy Olmstead)
  • OUC-B204 - Network Design and Deployment Strategies to Ensure Success for Microsoft Lync Server 2013 Enterprise Voice (Manfred Arndt)
  • OUC-B329 - Modern Public Folders Overview, Migration and Microsoft Office 365 (Siegfried Jagott)
2:45 PM-4:00 PM
  • OUC-B321 - All about Archiving with Microsoft Lync Server 2013 (Jason Collier)
  • OUC-B222 - Introducing Lync Room System (David Groom)
  • OUC-B306 - Exchange Online Protection (Wendy Wilkes)
  • OUC-B341 - Microsoft Office 365 Directory and Access Management with Windows Azure Active Directory (Ross Adams, Paul Andrew & Jono Luk)

You can use the Schedule Builder on the TechEd web site to select the sessions you want to attend and sync session info with your Outlook calendar (and have the info handy on your mobile device). For more info, head over to the TechEd North America 2013 web site.

If you’re attending, swing by the Microsoft Office booths to meet Exchange, SharePoint & Office team folks. We’d love to hear from you and answer your Exchange-related questions.

Microsoft TechEd 2013 TLC Floor map

Also check out the following posts from our friends in the Office team:

We look forward to seeing you in New Orleans next week!

Exchange Team

The Hybrid Free Busy Troubleshooter Now Available

As customers move their organization into the Cloud or choose to coexist, there is a need to ensure that some of the basic functionality users have grown accustomed to continues to work. While some of you will move all of your users in a cutover fashion, which reduces complexity, others will choose a more gradual approach. This troubleshooter is for administrators who have chosen the hybrid approach.

Are you seeing the hash marks in your hybrid Exchange environment as depicted below and want to get rid of them? Then this troubleshooter is for you.

image

The reason we focused on a troubleshooter for Free Busy is because it is the most commonly used “feature set” in a hybrid deployment. If you were to resolve issues with Free Busy lookups, many of the other potential issues you have with your hybrid deployment would be resolved as well.

What is a Hybrid Deployment?

A Hybrid Deployment consists of an on-premises Exchange server environment that has at least one Exchange 2010 or Exchange 2013 server. In this environment there is also a DirSync (Directory Synchronization) server, and in many cases, a deployment of ADFS (Active Directory Federation Services) to provide single sign-on capabilities to the users.

The idea of the hybrid environment is to allow two separate organizations (Exchange Online and Exchange On-Premises) to feel like one organization. To accomplish this, we rely on a token authorization process that is made possible through a combination of Organization Relationships and Federation Trusts with the Microsoft Federation Gateway.

When this is configured properly, you can do basic things like redirect OWA requests to their proper destination, see “MailTips” for a user, and of course the most common feature, view availability information for another user cross-premises.
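
If you want a quick sanity check of those objects before working through the troubleshooter, something like the following sketch, run from the on-premises Exchange Management Shell, can confirm that the federation trust and organization relationship are in place and then exercise the relationship (the relationship name and user are placeholders for your own values):

# Review the federation trust and the organization relationship used for cross-premises lookups
Get-FederationTrust | Format-List Name,ApplicationUri,TokenIssuerUri
Get-OrganizationRelationship | Format-List Name,DomainNames,FreeBusyAccessEnabled,TargetAutodiscoverEpr

# Exercise the relationship on behalf of an on-premises user (placeholder values)
Test-OrganizationRelationship -Identity "On-premises to O365" -UserIdentity user@contoso.com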

To read more about Hybrid Deployments click here.

This sounds hard to configure. How can I avoid issues?

If you are the type that does not like running into issues, you can attempt to avoid them: all you have to do is deploy using the Hybrid Configuration Wizard and the Exchange Deployment Assistant. These tools have been designed to get you into an optimal hybrid configuration, which should limit the number of issues you face. However, with all of the moving parts involved and the numerous variants in on-premises deployments, you could still run into issues.

You may ask, “Why do I need a troubleshooter? I use Bing or I get Scroogled.”

When working with customers and engineers, we have found that the troubleshooting steps that need to be followed are not very clear. There is confusion about which steps are applicable when free busy works in one direction (Cloud to on-premises) but not in the other (on-premises to Cloud). While searching Bing for answers can definitely lead to a solution, we believe we can be more expedient by using the troubleshooter to target solutions at your specific symptom.

The troubleshooter can be found here or at the following simple URL: http://aka.ms/hybridfreebusy

Thanks to Charlotte Raymundo, Nagesh Mahadev, Edgar Quevedo, Geoffrey Crisp, Star Li and Chen Jiang for their help in creation and review of this troubleshooter.

Timothy Heeney

Per-Server Database Limits Explained

Over the past year, we have discussed the architectural changes that have been introduced in Exchange Server 2013. I wrote about the reduction in complexity that the new server role architecture introduces, as well as one of the new capabilities introduced in Exchange 2013, Managed Availability’s recovery oriented computing. However, we haven’t been clear on other architectural changes that have shaped decisions we’ve made about the Exchange 2013 product, such as the decision to reduce the number of databases supported per server from 100 to 50. There were three main reasons for this:

  1. Server architecture changes
  2. Use of commodity hardware
  3. Testing

Let me explain each of these in more detail.

Server Architecture

Exchange 2013 includes fundamental changes to the search and store components and to how data is processed and rendered.

The old content indexing service was replaced with Search Foundation. Search Foundation is an actively developed search platform that is used across the Office Server products. Search Foundation allows us to have notification-driven content indexing, which improves indexing performance; in addition, we now annotate during transport, significantly reducing the number of times a message must be indexed.

The monolithic store.exe process was re-written; store is now written in managed code and there are now at least three processes that make up the Information Store service: the Microsoft Exchange Replication service, the Information Store service process controller, and the Information Store worker process. By utilizing the worker process model, each database is now isolated from every other database (e.g., a database crashing due to a malformed message will not bring down the rest of the databases on the server).
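
One quick way to see this isolation on an Exchange 2013 Mailbox server is to count the store worker processes, which should roughly correspond to the database copies hosted on the server, alongside the single controller process. The process names used here are assumptions based on a default install:

# One worker process per database copy, plus a single controller process
Get-Process -Name Microsoft.Exchange.Store.Worker -ErrorAction SilentlyContinue |
    Measure-Object | Select-Object Count
Get-Process -Name Microsoft.Exchange.Store.Service -ErrorAction SilentlyContinue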

In addition, there is a core shift in the server role architecture such that the protocol responsible for servicing a user’s request is the protocol instance that is local to the user’s active mailbox database copy. This means that the Mailbox server role now performs more work when compared to its Exchange 2010 counterpart.

The end result is that with the server architecture changes we introduced in Exchange 2013, search, store, and the protocols typically can be CPU and memory bound, as opposed to disk IO or capacity bound.

Commodity Hardware

As discussed in our server sizing guidance, we are big fans of commodity server hardware. Office 365 is designed to run on commodity hardware that leverages 2 processor sockets and 12 disks – we do not leverage external storage chassis as this increases the operational complexity in the environment. Our Exchange 2013 Mailbox servers have less than 50 database copies per-server in Office 365.

Testing

The last reason as to why we limited support to 50 databases per-server is that we did not have actual deployments at any scale to validate that store, search, the protocols, and Managed Availability could handle 100 databases per-server. Automation and lab testing can only take you so far; the lack of real world usage was one of the key reasons why we chose to limit the database count.

Moving Forward

The Exchange Product Group takes pride in the feedback mechanisms we have invested in with the Exchange community. Since the release of Exchange 2013, we’ve received an inordinate amount of feedback regarding the reduction in supported databases per-server. The driving response has been “we currently deploy more than 50 databases per-server in Exchange 2010; with this change, this means we will need to deploy more servers, which increases our capital expenditures significantly.” Rest assured, that is not the message we want with Exchange 2013. It is true that Exchange 2013 utilizes more CPU and memory than its predecessors – this is due to the architecture changes we’ve made, as well as the changes we’ve made to reduce disk IO, so that you can deploy more mailboxes per disk. But we do not want to see architectures artificially limited by the supported databases per-server constraint.

Over the last several months, we’ve been working to resolve our concerns and improve our test matrices to validate supporting more databases/server.

As a result of the work done by the Mailbox Intelligence team and Operations teams, I am pleased to announce that when Exchange Server 2013 RTM Cumulative Update 2 (CU2) releases we are increasing the number of databases per-server back to 100. Both the Exchange 2013 Server Role Calculator and our sizing guidance will be updated to include this architectural change in tandem with CU2’s release.  CU2 will release later this summer.

As always, we continue to identify ways to better serve your needs through our regular servicing releases. We hope you find this architectural change useful. Please keep the feedback coming, we are listening.

Ross Smith IV
Principal Program Manager
Exchange Customer Experience

Outlook Connectivity to Office 365 Troubleshooter Now Available

It is no secret that if you are an Exchange/Office 365 administrator you will no doubt have to troubleshoot Outlook connectivity at some point. Whether you use Exchange Online, on-premises, or some combination of both, you will inevitably have an issue with Outlook performance, connectivity, profile corruption, or some other unknown Outlook disease before retirement.

To assist you with these issues, we have released a Guided Walk Through (GWT) for troubleshooting Outlook Connectivity issues in Office 365.  There are a couple of ways to access the troubleshooter.  You can access it directly at: http://aka.ms/outlookconnectivity

In addition, it will be embedded in various Outlook connectivity technical resources such as the following:

The purpose of this walk through is to assist you in resolving these complex issues by focusing on the scoping and steps used to isolate and resolve problems. Therefore the walk through starts by focusing on commonly encountered symptoms related to Outlook connectivity.

image

Consider that there might not be a single solution, but a combination of factors contributing to the problem. Following the walk through will allow you to isolate and remedy the most common causes of Outlook connectivity issues to Office 365.

This walk through is not meant to replace all of the data that helps you understand Outlook connectivity issues, but rather to quickly give you the steps you need to help find the solution. The walk through covers all versions of Office 365.

I wanted to thank the people who helped make this a reality. Here are the parties involved (that I am aware of):

Exchange/Outlook support:

  • Kevyn Pietsch
  • Timothy Heeney
  • Nitin Shukla
  • Nagesh Mahadev
  • Jeff Miller
  • Jon Bradley
  • Jeremy Hayes

Documentation / content creation teams:

  • Charlotte Raymundo
  • Serdar Soysal
  • Geoffrey Crisp
  • Star Li
  • Chen Jiang

Nagesh Mahadev

Exchange Server 2013 Architecture Poster PDF Download Available

We just released a downloadable PDF version of the Exchange Server 2013 Architecture Poster. This is the poster that we handed out at the Office booth and in various Exchange 2013 breakout sessions last week at TechEd North America 2013 in New Orleans, LA.  We’ll also be handing out printed copies of the poster at TechEd Europe 2013 in Madrid, Spain in a couple of weeks. While we cannot provide printed copies for everyone, you can download the PDF file and take it to your favorite printer/copy center, and have them print it for you.  It is designed to be printed in 36” x 24” format, as shown below:

ExchangePoster_Final_052313

This poster highlights the significantly updated and modernized architecture in Exchange 2013, including new technologies such as Managed Availability, the new storage and high availability features, and integration with SharePoint and Lync.  In addition, it illustrates the new transport architecture in Exchange 2013.

We welcome your feedback on the poster.  If you have any, please feel free to send it to  eapf@microsoft.com.

Scott Schnoll

What Did Managed Availability Just Do To This Service?

We in the Exchange product group get this question from time to time. The first thing we ask in response is always, “What was the customer impact?” In some cases, there is customer impact; these may indicate bugs that we are motivated to fix. However, in most cases there was no customer impact: a service restarted, but no one noticed. We have learned while operating the world’s largest Exchange deployment that it is fantastic when something is fixed before customers even notice. This is so desirable that we are willing to have a few extra service restarts as long as no customers are impacted.

You can see this same philosophy at work in our approach to database failovers since Exchange 2007. The mantra we have come to repeat is, “Stuff breaks, but the user experience doesn’t!” User experience is our number one priority at all times. Individual service uptime on a server is a less important goal, as long as the user experience remains satisfactory.

However, there are cases where Managed Availability cannot fix the problem. In cases like these, Exchange provides a huge amount of information about what the problem might be. Hundreds of things are checked and tested every minute. Usually, Get-HealthReport and Get-ServerHealth will be sufficient to find the problem, but this blog post will walk you through getting the full details from an automatic recovery action to the results of all the probes by:

  1. Finding the Managed Availability Recovery Actions that have been executed for a given service.
  2. Determining the Monitor that triggered the Responder.
  3. Retrieving the Probes that the Monitor uses.
  4. Viewing any error messages from the Probes.

Finding Recovery Actions

Every time Managed Availability takes a recovery action, such as restarting a service or failing over a database, it logs an event in the Microsoft.Exchange.ManagedAvailability/RecoveryActions crimson channel. Event 500 indicates that a recovery action has begun. Event 501 indicates that the action that was taken has completed. These can be collected via the MMC Event Viewer, but we usually find it more useful to use PowerShell. All of these Managed Availability recovery actions can be collected in PowerShell with a simple command:

$RecoveryActionResultsEvents = Get-WinEvent -ComputerName <Server> -LogName Microsoft-Exchange-ManagedAvailability/RecoveryActionResults

We can use the events in this format, but it is easier to work with the event properties if we use PowerShell’s native XML format:

$RecoveryActionResultsXML = ($RecoveryActionResultsEvents | Foreach-object -Process {[XML]$_.toXml()}).event.userData.eventXml

Some of the useful properties for this Recovery Action event are:

  • Id: The action that was taken. Common values are RestartService, RecycleAppPool, ComponentOffline, or ServerFailover.
  • State: Whether the action has started (event 500) or finished (event 501).
  • ResourceName: The object that was affected by the action. This will be the name of a service for RestartService actions, or the name of a server for server-level actions.
  • EndTime: The time the action completed.
  • Result: Whether the action succeeded or not.
  • RequestorName: The name of the Responder that took the action.

So for example, if you wanted to know why MSExchangeRepl was restarted on your server around 9:30PM, you could run a command like this:

$RecoveryActionResultsXML | Where-Object {$_.State -eq "Finished" -and $_.ResourceName -eq "MSExchangeRepl" -and $_.EndTime -like "2013-05-12T21*"}| ft -AutoSize StartTime,RequestorName

This results in the following output:

StartTime                     RequestorName
---------                     -------------
2013-05-12T21:49:18.2113618Z  ServiceHealthMSExchangeReplEndpointRestart

The RequestorName property indicates the name of the Responder that took the action. In this case, it was ServiceHealthMSExchangeReplEndpointRestart. Often, the responder name will give you an indication of the problem. Other times, you will want more details.

Finding the Monitor that Triggers a Responder

Monitors are the central part of Managed Availability. They are the primary means, through Get-ServerHealth and Get-HealthReport, by which an administrator can learn the health of a server. Recall that a Health Set is a grouping of related Monitors. This is why much of our troubleshooting documentation is focused on these objects. It will often be useful to know what Monitors and Health Sets are repeatedly unhealthy in your environment.
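
For example, a quick way to see which Monitors are currently reporting a problem, grouped by Health Set, might look like the following sketch (the server name is a placeholder):

# List Monitors that are not currently healthy on a server, grouped by Health Set
Get-ServerHealth -Identity MBX-1 |
    Where-Object {$_.AlertValue -ne "Healthy"} |
    Sort-Object HealthSetName |
    Format-Table -AutoSize HealthSetName,Name,AlertValue,TargetResource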

Every time the Health Manager service starts, it logs events to the Microsoft.Exchange.ActiveMonitoring/ResponderDefinition crimson channel, which we can use to get the properties of the Responders we found in the last step by the RequestorName property. First, we need to collect the Responders that are defined:

$DefinedResponders = (Get-WinEvent -ComputerName <Server> -LogName Microsoft-Exchange-ActiveMonitoring/ResponderDefinition | % {[xml]$_.toXml()}).event.userData.eventXml

One of these Responder Definitions will match the Recovery Action’s RequestorName. The Monitor that controls the Responder we are interested in is defined by the AlertMask property of that Definition. Here are some of the useful Responder Definition properties:

  • TypeName: The full code name of the recovery action that will be taken when this Responder executes.
  • Name: The name of the Responder.
  • TargetResource: The object this Responder will act on.
  • AlertMask: The Monitor for this Responder.
  • WaitIntervalSeconds: The minimum amount of time to wait before this Responder can be executed again. There are other forms of throttling that will also affect this Responder.

To get the Monitor for the ServiceHealthMSExchangeReplEndpointRestart Responder, you run:

$DefinedResponders | ? {$_.Name -eq "ServiceHealthMSExchangeReplEndpointRestart"} | ft -a Name,AlertMask

This results in the following output:

Name                                        AlertMask
----                                        ---------
ServiceHealthMSExchangeReplEndpointRestart  ServiceHealthMSExchangeReplEndpointMonitor

Many Monitor names will give you an idea of what to look for. In this case, the ServiceHealthMSExchangeReplEndpointMonitor Monitor does not tell you much more than the Responder name did. The Technet article on Troubleshooting DataProtection Health Set lists this Monitor and suggests running Test-ReplicationHealth. However, you can also get the exact error messages of the Probes for this Monitor with a couple more commands.

Finding the Probes for a Monitor

Remember that Monitors have their definitions written to the Microsoft.Exchange.ActiveMonitoring/MonitorDefinition crimson channel. Thus, you can get these in a similar way as the Responder definitions in the last step. You can run:

$DefinedMonitors = (Get-WinEvent -ComputerName <Server> -LogName Microsoft-Exchange-ActiveMonitoring/MonitorDefinition | % {[xml]$_.toXml()}).event.userData.eventXml

Some useful properties of a Monitor definition are:

  • Name: The name of this Monitor. This is the same name reported by Get-ServerHealth.
  • ServiceName: The name of the Health Set for this Monitor.
  • SampleMask: The substring that all Probes for this Monitor will have in their names.
  • IsHaImpacting: Whether this Monitor should be included when HaImpactingOnly is specified by Get-ServerHealth or Get-HealthReport.

To get the SampleMask for the identified Monitor, you can run:

($DefinedMonitors | ? {$_.Name -eq 'ServiceHealthMSExchangeReplEndpointMonitor'}).SampleMask

This results in the following output:

ServiceHealthMSExchangeReplEndpointProbe

 

Now that we know what Probes to look for, we can search the Probes’ definition channel. Useful properties for Probe Definitions are:

  • Name: The name of the Probe. This will begin with the SampleMask of the Probe’s Monitor.
  • ServiceName: The Health Set for this Probe.
  • TargetResource: The object this Probe is validating. This is appended to the Name of the Probe when it is executed to become a Probe Result ServiceName.
  • RecurrenceIntervalSeconds: How often this Probe executes.
  • TimeoutSeconds: How long this Probe should wait before failing.

To get definitions of this Monitor’s Probes, you can run:

(Get-WinEvent -ComputerName <Server> -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ? {$_.Name -like "ServiceHealthMSExchangeReplEndpointProbe*"} | ft -a Name, TargetResource

This results in the following output:

Name                                                    TargetResource
----                                                    --------------
ServiceHealthMSExchangeReplEndpointProbe/ServerLocator  MSExchangeRepl
ServiceHealthMSExchangeReplEndpointProbe/RPC            MSExchangeRepl
ServiceHealthMSExchangeReplEndpointProbe/TCP            MSExchangeRepl

Remember, not all Monitors use synthetic transactions via Probes. See this blog post for the other ways Monitors collect their information.

This Monitor has three Probes that can cause it to become Unhealthy. You’ll see that each Probe’s name begins with the Monitor’s SampleMask and is then differentiated. When getting the Probe Results in the next step, the Probes will also have the TargetResource in their ServiceName.

Now we know all the Probes that could have failed, but we don’t yet know which of them did, or why.

Getting Probe Error Messages

There are many Probes and they execute often, so the channel where they are logged (Microsoft.Exchange.ActiveMonitoring/ProbeResult) generates a lot of data. There will often only be a few hours of data, but the Probes we are interested in will probably have a few hundred Result entries. Here are some of the Probe Result properties you may be interested in for troubleshooting:

  • ServiceName: The Health Set of this Probe.
  • ResultName: The Name of this Probe, including the Monitor’s SampleMask, an identifier of the code this Probe executes, and the resource it verifies. The target resource is appended to the Probe’s name we found in the previous step. In this example, we append /MSExchangeRepl to get ServiceHealthMSExchangeReplEndpointProbe/RPC/MSExchangeRepl.
  • Error: The error returned by this Probe, if it failed.
  • Exception: The callstack of the error, if it failed.
  • ResultType: An integer that indicates one of these values:
    • 1: Timeout
    • 2: Poisoned
    • 3: Succeeded
    • 4: Failed
    • 5: Quarantined
    • 6: Rejected
  • ExecutionStartTime: When the Probe started.
  • ExecutionEndTime: When the Probe completed.
  • ExecutionContext: Additional information about the Probe’s execution.
  • FailureContext: Additional information about the Probe’s failure.

Some Probes may use some of the other available fields to provide additional data about failures.

We can use XPath to filter the large number of events to just the ones we are interested in; those with the ResultName we identified in the last step and with a ResultType of 4 indicating that they failed:

$replEndpointProbeResults = (Get-WinEvent -ComputerName <Server> -LogName Microsoft-Exchange-ActiveMonitoring/ProbeResult -FilterXPath "*[UserData[EventXML[ResultName='ServiceHealthMSExchangeReplEndpointProbe/RPC/MSExchangeRepl'][ResultType='4']]]" | % {[XML]$_.toXml()}).event.userData.eventXml

To get a nice graphical view of the Probe’s errors, you can run:

$replEndpointProbeResults | select -Property *Time,Result*,Error*,*Context,State* | Out-GridView

image

In this case, the full error message for both Probe Results suggests making sure the MSExchangeRepl service is running. This actually is the problem, as for this scenario I restarted the service manually.

Summary

This article is a detailed look at how you have access to an incredible amount of information about the health of Exchange Servers.  Hopefully, you will not often need it! In most cases, the alerts will be enough notification and the included cmdlets will be sufficient for investigation.

Managed Availability is built and hardened at scale, and we continuously analyze these same events collected in this article so that we can either fix root causes or write Responders to fix more problems before users are impacted. In those cases where you do need to investigate a problem in detail, we hope this post is a good starting point.

Abram Jackson

Updated: Exchange Server 2013 Deployment Assistant

We’re happy to announce updates to the Exchange Server 2013 Deployment Assistant!

We’ve updated the Deployment Assistant to include the following new scenarios:

  • Upgrading from Exchange 2007 to Exchange 2013
  • Upgrading from Exchange 2010 to Exchange 2013
  • Configuring an Exchange 2013-based hybrid deployment for Exchange 2007 organizations

These new scenarios provide step-by-step guidance about how to upgrade your existing Exchange 2007 or Exchange 2010 organizations to benefit from the improvements and new features of Exchange 2013. Plus, Exchange 2007 organizations can now configure a hybrid deployment with Office 365 using Exchange 2013 instead of Exchange 2010 SP3 in their on-premises organization.

And, there’s more on the way! We’re also working hard on additional scenarios, such as upgrading from a mixed Exchange Server 2007/2010 organization to Exchange 2013 and configuring Exchange 2013-based hybrid for Exchange 2010 organizations. Keep checking back here for release announcements.

In case you're not familiar with it, the Exchange Server 2013 Deployment Assistant is a web-based tool that helps you deploy Exchange 2013 in your on-premises organization, configure a hybrid deployment between your on-premises organization and Office 365, or migrate to Office 365. The tool asks you a small set of simple questions and then, based on your answers, creates a customized checklist with instructions to deploy or configure Exchange 2013. Instead of trying to find what you need in the Exchange library, the Deployment Assistant gives you exactly the right information you need to complete your task. Supported on most major browsers, the Deployment Assistant is your one-stop shop for deploying Exchange 2013.

The updated Exchange 2013 Deployment Assistant
Figure 1: The updated Exchange 2013 Deployment Assistant (large screenshot)

And for those organizations that still need to deploy Exchange 2010 or are interested in configuring an Exchange 2010-based hybrid deployment with Office 365, you can continue to access the Exchange Server 2010 Deployment Assistant at http://technet.microsoft.com/exdeploy2010 (short URL: aka.ms/eda2010).

Do you have a deployment success story about the Deployment Assistant? Do you have suggestions on how to improve the tool? We would love your feedback and comments! Feel free to leave a comment here, or send an email to edafdbk@microsoft.com directly or via the 'Feedback' link located in the header of every page of the Deployment Assistant.

Happy deploying!

The Deployment Assistant Team


Exchange 2010 Database Availability Groups and Disk Sector Sizes

These days, some customers are deploying Exchange databases and log files on advanced format (4K) drives.  Although these drives support a physical sector size of 4096, many vendors are emulating 512 byte sectors in order to maintain backwards compatibility with applications and operating systems.  This is known as 512 byte emulation (512e).  Windows 2008 and Windows 2008 R2 support native 512 byte and 512 byte emulated advanced format drives.  Windows 2012 supports drives of all sector sizes.  The sector size presented to applications and the operating system, and how applications respond to it, directly affects data integrity and performance.

For more information on sector sizes see the following links:

When deploying an Exchange 2010 Database Availability Group (DAG), the sector sizes of the volumes hosting the databases and log files must be the same across all nodes within the DAG.  This requirement is outlined in Understanding Storage Configuration.

Support requires that all copies of a database reside on the same physical disk type. For example, it is not a supported configuration to host one copy of a given database on a 512-byte sector disk and another copy of that same database on a 512e disk. Also be aware that 4-kilobyte (KB) sector disks are not supported for any version of Microsoft Exchange and 512e disks are not supported for any version of Exchange prior to Exchange Server 2010 SP1.
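
One quick way to verify this across the DAG from the Shell is sketched below; it simply wraps the same FSUTIL query used later in this post.  The DAG name and drive letter are placeholders for your own values, and remote PowerShell/WinRM must be enabled on the members:

# Compare the reported sector sizes for the Z: volume across all DAG members
$servers = (Get-DatabaseAvailabilityGroup SectorTestDAG).Servers
foreach ($server in $servers) {
    Write-Host "=== $($server.Name) ==="
    Invoke-Command -ComputerName $server.Name -ScriptBlock { fsutil fsinfo ntfsinfo z: } |
        Select-String "Bytes Per Sector", "Bytes Per Physical Sector"
}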

Recently, we have noted that some customers have experienced issues with log file replication and replay as the result of sector size mismatch.  These issues occur when:

  • Storage drivers are upgraded resulting in the recognized sector size changing.
  • Storage firmware is upgraded resulting in the recognized sector size changing.
  • New storage is presented or existing storage is replaced with drives of a different sector size.

This mismatch can cause one or more database copies in a DAG to fail, as illustrated below. In my example environment, I have a three-member DAG with a single database that resides on a volume labeled Z that is replicated between each member.

[PS] C:\>Get-MailboxDatabaseCopyStatus *

Name             Status  CopyQueueLength ReplayQueueLength LastInspectedLogTime  ContentIndexState
----             ------  --------------- ----------------- --------------------  -----------------
SectorTest\MBX-1 Mounted 0               0                                       Healthy
SectorTest\MBX-2 Healthy 0               1                 3/19/2013 10:27:50 AM Healthy
SectorTest\MBX-3 Healthy 0               1                 3/19/2013 10:27:50 AM Healthy

If I use FSUTIL to query the Z volume on each DAG member, we can see that the volume currently has 512 logical bytes per sector and 512 physical bytes per sector.  Thus, the volume is currently seen by the operating system as having a native 512 byte sector size.

On MBX-1:

C:\>fsutil fsinfo ntfsinfo z:

NTFS Volume Serial Number :       0x18d0bc1dd0bbfed6
Version :                         3.1
Number Sectors :                  0x000000000fdfe7ff
Total Clusters :                  0x0000000001fbfcff
Free Clusters  :                  0x0000000001fb842c
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512

Bytes Per Cluster :               4096
Bytes Per FileRecord Segment    : 1024
Clusters Per FileRecord Segment : 0
Mft Valid Data Length :           0x0000000000040000
Mft Start Lcn  :                  0x00000000000c0000
Mft2 Start Lcn :                  0x0000000000000002
Mft Zone Start :                  0x00000000000c0040
Mft Zone End   :                  0x00000000000cc840
RM Identifier:        EF486117-9094-11E2-BF55-00155D006BA1

On MBX-3:

C:\>fsutil fsinfo ntfsinfo z:

NTFS Volume Serial Number :       0x0ad44aafd44a9d37
Version :                         3.1
Number Sectors :                  0x000000000fdfe7ff
Total Clusters :                  0x0000000001fbfcff
Free Clusters  :                  0x0000000001fad281
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       512

Bytes Per Cluster :               4096
Bytes Per FileRecord Segment    : 1024
Clusters Per FileRecord Segment : 0
Mft Valid Data Length :           0x0000000000040000
Mft Start Lcn  :                  0x00000000000c0000
Mft2 Start Lcn :                  0x0000000000000002
Mft Zone Start :                  0x00000000000c0000
Mft Zone End   :                  0x00000000000cc820
RM Identifier:        B9B00E32-90B2-11E2-94E9-00155D006BA3

Effects of storage changes

But what happens if there is a change in the way storage is seen on MBX-3, so that the volume now reflects a 512e sector size?  This can happen when upgrading storage drivers, upgrading firmware, or presenting new storage that implements advanced format storage.

C:\>fsutil fsinfo ntfsinfo z:

NTFS Volume Serial Number :       0x0ad44aafd44a9d37
Version :                         3.1
Number Sectors :                  0x000000000fdfe7ff
Total Clusters :                  0x0000000001fbfcff
Free Clusters  :                  0x0000000001fad2e7
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Physical Sector :       4096

Bytes Per Cluster :               4096
Bytes Per FileRecord Segment    : 1024
Clusters Per FileRecord Segment : 0
Mft Valid Data Length :           0x0000000000040000
Mft Start Lcn  :                  0x00000000000c0000
Mft2 Start Lcn :                  0x0000000000000002
Mft Zone Start :                  0x00000000000c0040
Mft Zone End   :                  0x00000000000cc840
RM Identifier:        B9B00E32-90B2-11E2-94E9-00155D006BA3

When reviewing the database copy status, notice that the copy assigned to MBX-3 has failed.

[PS] C:\>Get-MailboxDatabaseCopyStatus *

Name             Status  CopyQueueLength ReplayQueueLength LastInspectedLogTime  ContentIndexState
----             ------  --------------- ----------------- --------------------  -----------------
SectorTest\MBX-1 Mounted 0               0                                       Healthy
SectorTest\MBX-2 Healthy 0               0                 3/19/2013 11:13:05 AM Healthy
SectorTest\MBX-3 Failed  0               8                 3/19/2013 11:13:05 AM Healthy

The full details of the copy status of MBX-3 can be reviewed to display the detailed error:

[PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\MBX-3 | fl

RunspaceId                       : 5f4bb58b-39fb-4e3e-b001-f8445890f80a
Identity                         : SectorTest\MBX-3
Name                             : SectorTest\MBX-3
DatabaseName                     : SectorTest
Status                           : Failed
MailboxServer                    : MBX-3
ActiveDatabaseCopy               : mbx-1
ActivationSuspended              : False
ActionInitiator                  : Service
ErrorMessage                     : The log copier was unable to continue processing for database 'SectorTest\MBX-3' because an error occurred on the target server: Continuous replication - block mode has been terminated. Error: the log file sector size does not match the current volume's sector size (-546) [HResult: 0x80131500]. The copier will automatically retry after a short delay.
ErrorEventId                     : 2152
ExtendedErrorInfo                :
SuspendComment                   :
SinglePageRestore                : 0
ContentIndexState                : Healthy
ContentIndexErrorMessage         :
CopyQueueLength                  : 0
ReplayQueueLength                : 7
LatestAvailableLogTime           : 3/19/2013 11:13:05 AM
LastCopyNotificationedLogTime    : 3/19/2013 11:13:05 AM
LastCopiedLogTime                : 3/19/2013 11:13:05 AM
LastInspectedLogTime             : 3/19/2013 11:13:05 AM
LastReplayedLogTime              : 3/19/2013 10:24:24 AM
LastLogGenerated                 : 53
LastLogCopyNotified              : 53
LastLogCopied                    : 53
LastLogInspected                 : 53
LastLogReplayed                  : 46
LogsReplayedSinceInstanceStart   : 0
LogsCopiedSinceInstanceStart     : 0
LatestFullBackupTime             :
LatestIncrementalBackupTime      :
LatestDifferentialBackupTime     :
LatestCopyBackupTime             :
SnapshotBackup                   :
SnapshotLatestFullBackup         :
SnapshotLatestIncrementalBackup  :
SnapshotLatestDifferentialBackup :
SnapshotLatestCopyBackup         :
LogReplayQueueIncreasing         : False
LogCopyQueueIncreasing           : False
OutstandingDumpsterRequests      : {}
OutgoingConnections              :
IncomingLogCopyingNetwork        :
SeedingNetwork                   :
ActiveCopy                       : False

Using the Exchange Server Error Code Look-up tool (ERR.EXE), we can verify the definition of the error code –546.

D:\Utilities\ERR>err -546

# for decimal -546 / hex 0xfffffdde
  JET_errLogSectorSizeMismatch                                   esent98.h
# /* the log file sector size does not match the current
# volume's sector size */
# 1 matches found for "-546"

In addition, the Application event log may contain the following entries:

Log Name:      Application
Source:        MSExchangeRepl
Date:          3/19/2013 11:14:58 AM
Event ID:      2152
Task Category: Service
Level:         Error
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The log copier was unable to continue processing for database 'SectorTest\MBX-3' because an error occured on the target server: Continuous replication - block mode has been terminated. Error: the log file sector size does not match the current volume's sector size (-546) [HResult: 0x80131500]. The copier will automatically retry after a short delay.

The cause

Why does this issue occur?
Each log file records in the header the sector size of the disk where a log file was created.  For example, this is the header of a log file on MBX-1 with a native 512 byte sector size:

Z:\SectorTest>eseutil /ml E0100000001.log

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 14.02
Copyright (C) Microsoft Corporation. All Rights Reserved.
Initiating FILE DUMP mode... 
      Base name: E01
      Log file: E0100000001.log
      lGeneration: 1 (0x1)
      Checkpoint: (0x38,FFFF,FFFF)
      creation time: 03/19/2013 09:40:14
      prev gen time: 00/00/1900 00:00:00
      Format LGVersion: (7.3704.16.2)
      Engine LGVersion: (7.3704.16.2)
      Signature: Create time:03/19/2013 09:40:14 Rand:11019164 Computer:
      Env SystemPath: z:\SectorTest\
      Env LogFilePath: z:\SectorTest\
     Env Log Sec size: 512 (matches)
      Env (CircLog,Session,Opentbl,VerPage,Cursors,LogBufs,LogFile,Buffers)
          (    off,   1227,  61350,  16384,  61350,   2048,   2048,  44204)
      Using Reserved Log File: false
      Circular Logging Flag (current file): off
      Circular Logging Flag (past files): off
      Checkpoint at log creation time: (0x1,8,0) 
      Last Lgpos: (0x1,A,0)
Number of database page references:  0
Integrity check passed for log file: E0100000001.log
Operation completed successfully in 0.62 seconds.

The sector size that is chosen is determined through one of two methods:

  • If the log stream is brand new, read the sector size from disk and utilize this sector size.
  • If the log stream already exists, use the sector size of the given log stream.

In theory, since the sector size of disks should not be changing across nodes and the sector size of all disks must match, this should not cause a problem.  In our example, and in some customer environments, these sector sizes are actually changing.  Since most of these databases already exist, the existing sector size of the log stream is utilized, which in turn causes a mismatch between DAG members.

When a mismatch occurs, the issue only prevents the successful use of block mode replication.  It does not affect file mode replication.  Block mode replication was introduced in Exchange 2010 Service Pack 1.  For more information on block mode replication, see New High Availability and Site Resilience Functionality in Exchange 2010 SP1.

Why does this only affect block mode replication?
When a log file is addressed, we reference locations within it by log file position.  The log file position is a combination of the log generation, the sector, and the offset within that sector.  For example, in the previous header dump you can see the “Last Lgpos” is (0x1,A,0) – this happens to be the last log file position within the log.  Let us say we were creating a block for block mode replication within log file generation 0x1A, at sector 8, offset 1 – this would be reflected as an LGPOS of (0x1a,8,1).  When this block is transmitted to a host with an advanced sector size disk, the log position has to be translated: on an advanced format disk this same log position would be (0x1a,1,1).  As you can see, it could create significant problems if incorrect positions within a log file were written to or read from.
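
To make the translation concrete, here is a small sketch of the arithmetic for the example above, assuming the sector/offset pair maps to a byte offset as (sector x sector size) + offset, as the example implies:

# LGPOS (0x1a,8,1) on a 512-byte sector volume...
$byteOffset = 8 * 512 + 1                        # 4097
# ...re-expressed against 4096-byte sectors
$sector4k = [math]::Floor($byteOffset / 4096)    # 1
$offset4k = $byteOffset % 4096                   # 1
"Byte offset {0} = sector {1}, offset {2} on a 4K volume, i.e. LGPOS (0x1a,{1},{2})" -f $byteOffset, $sector4k, $offset4k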

The resolution

How do I go about correcting this condition?
To fix this condition, first ensure that the same sector sizes exist on all disks across all nodes that host Exchange data, and then reset the log stream.

The following steps can show you how to do this with minimal downtime.

  1. Ensure that Exchange 2010 Service Pack 2 or later is installed on all DAG members.

    Note: Exchange 2010 Service Pack 1 and earlier do not support 512e volumes.

  2. Disable block mode replication on all hosts.  This step requires restarting the replication service on each node.  This will temporarily cause all copies to fail on passive nodes when the service is restarted on the active node.  When the service is restarted on a passive node, only the passive copies on that node will enter a failed state.  Databases that are mounted and client connections are not impacted by this activity.  Block mode replication should remain disabled until all steps have been completed on all DAG members.  (A Shell-based sketch of this registry change appears after this list.)
    1. Launch registry editor.
    2. Navigate to HKLM\Software\Microsoft\ExchangeServer\V14\Replay\Parameters
    3. Right click in the parameters key and select New–> DWORD
    4. The name for the DWORD is DisableGranularReplication
    5. The value for the DWORD is 1
  3. Restart the Microsoft Exchange Replication service on each member using the Shell: Restart-Service MSExchangeRepl

  4. Validate that all copies of databases across DAG members are healthy at this time:

    [PS] C:\>Get-MailboxDatabaseCopyStatus *

    Name             Status  CopyQueueLength ReplayQueueLength LastInspectedLogTime  ContentIndexState
    ----             ------  --------------- ----------------- --------------------  -----------------
    SectorTest\MBX-1 Mounted 0               0                                       Healthy
    SectorTest\MBX-2 Healthy 0               0                 3/19/2013 12:28:34 PM Healthy
    SectorTest\MBX-3 Healthy 0               0                 3/19/2013 12:28:34 PM Healthy

  5. Apply the appropriate hotfix for Windows Server 2008 or Windows Server 2008 R2 and Advanced Format Disks.  Windows Server 2012 does not require a hotfix.

    • Windows 2008 R2: KB 982018 - An update that improves the compatibility of Windows 7 and Windows Server 2008 R2 with Advanced Format Disks is available
    • Windows 2008: KB 2553708 - A hotfix rollup that improves Windows Vista and Windows Server 2008 compatibility with Advanced Format disks
  6. Repeat the procedure that caused the disk sector size to change.  For example, if the issue arose as a result of upgrading drivers and firmware on a host, utilize your maintenance mode procedures to complete the driver and firmware upgrade on all hosts.

    Note: If your installation does not allow for you to use the same sector sizes across all DAG members, then the implementation is not supported.

  7. Utilize FSUTIL to ensure that the sector sizes match across all hosts for the log and database volumes. 

    On MBX-1:

    C:\>fsutil fsinfo ntfsinfo z:

    NTFS Volume Serial Number :       0x18d0bc1dd0bbfed6
    Version :                         3.1
    Number Sectors :                  0x000000000fdfe7ff
    Total Clusters :                  0x0000000001fbfcff
    Free Clusters  :                  0x0000000001fac6e6
    Total Reserved :                  0x0000000000000000
    Bytes Per Sector  :               512
    Bytes Per Physical Sector :       4096

    Bytes Per Cluster :               4096
    Bytes Per FileRecord Segment    : 1024
    Clusters Per FileRecord Segment : 0
    Mft Valid Data Length :           0x0000000000040000
    Mft Start Lcn  :                  0x00000000000c0000
    Mft2 Start Lcn :                  0x0000000000000002
    Mft Zone Start :                  0x00000000000c0040
    Mft Zone End   :                  0x00000000000cc840
    RM Identifier:        EF486117-9094-11E2-BF55-00155D006BA1

    On MBX-2:

    C:\>fsutil fsinfo ntfsinfo z:

    NTFS Volume Serial Number :       0xfa6a794c6a790723
    Version :                         3.1
    Number Sectors :                  0x000000000fdfe7ff
    Total Clusters :                  0x0000000001fbfcff
    Free Clusters  :                  0x0000000001fac86f
    Total Reserved :                  0x0000000000000000
    Bytes Per Sector  :               512
    Bytes Per Physical Sector :       4096

    Bytes Per Cluster :               4096
    Bytes Per FileRecord Segment    : 1024
    Clusters Per FileRecord Segment : 0
    Mft Valid Data Length :           0x0000000000040000
    Mft Start Lcn  :                  0x00000000000c0000
    Mft2 Start Lcn :                  0x0000000000000002
    Mft Zone Start :                  0x00000000000c0040
    Mft Zone End   :                  0x00000000000cc840
    RM Identifier:        5F18A2FC-909E-11E2-8599-00155D006BA2

    On MBX-3:

    C:\>fsutil fsinfo ntfsinfo z:

    NTFS Volume Serial Number :       0x0ad44aafd44a9d37
    Version :                         3.1
    Number Sectors :                  0x000000000fdfe7ff
    Total Clusters :                  0x0000000001fbfcff
    Free Clusters  :                  0x0000000001fabfd6
    Total Reserved :                  0x0000000000000000
    Bytes Per Sector  :               512
    Bytes Per Physical Sector :       4096

    Bytes Per Cluster :               4096
    Bytes Per FileRecord Segment    : 1024
    Clusters Per FileRecord Segment : 0
    Mft Valid Data Length :           0x0000000000040000
    Mft Start Lcn  :                  0x00000000000c0000
    Mft2 Start Lcn :                  0x0000000000000002
    Mft Zone Start :                  0x00000000000c0040
    Mft Zone End   :                  0x00000000000cc840
    RM Identifier:        B9B00E32-90B2-11E2-94E9-00155D006BA3

At this point, the DAG should be stable, and replication should be occurring as expected between databases using file mode. In order to restore block mode replication and fully recognize the new disk sector sizes, the log stream must be reset.

IMPORTANT: Please note the following about resetting the log stream:

  • The log stream must be fully reset on all database copies.
  • All lagged database copies must be replayed to current log.
  • If backups are utilized as a recovery method, this will introduce a gap in the log file sequence, preventing a full roll-forward recovery from the last backup point.

You can use the following steps to reset the log stream:

  1. Validate the existence of a replay queue:

    [PS] C:\>Get-MailboxDatabaseCopyStatus *

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Mounted      0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                0                  3/19/2013 1:34:37 PM   Healthy
    SectorTest\MBX-3    Healthy      0                138                3/19/2013 1:34:37 PM   Healthy

  2. Set the replay and truncation lag time values to 0 on all database copies. This ensures that logs replay to current while allowing the databases to remain online. In this example, MBX-3 is a lagged copy database. When the configuration change is detected, log replay will occur, allowing the lagged copy to eventually catch up. Note that depending on the replay lag time, this could take several hours before you can proceed to the next steps.

    [PS] C:\>Set-MailboxDatabaseCopy SectorTest\MBX-3 -ReplayLagTime 0.0:0:0 -TruncationLagTime 0.0:0:0
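
    A hedged one-liner for applying the same change to every copy of the database at once (an illustration only; running the command per copy as shown above works just as well):

    [PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\* | ForEach-Object { Set-MailboxDatabaseCopy -Identity $_.Name -ReplayLagTime 0.0:0:0 -TruncationLagTime 0.0:0:0 }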

    Validate that the replay queue has caught up and is near zero.

    [PS] C:\>Get-MailboxDatabaseCopyStatus *

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Mounted      0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                0                  3/19/2013 1:34:37 PM   Healthy
    SectorTest\MBX-3    Healthy      0                0                  3/19/2013 1:34:37 PM   Healthy

  3. Dismount the database.

    CAUTION: Dismounting the database will cause a client interruption, which will continue until the database is mounted.

    [PS] C:\>Dismount-Database SectorTest

    Confirm
    Are you sure you want to perform this action?
    Dismounting database "SectorTest". This may result in reduced availability for mailboxes in the database.
    [Y] Yes  [A] Yes to All  [N] No  [L] No to All  [?] Help (default is "Y"): y
    [PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\*
    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Dismounted   0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                0                  3/25/2013 5:41:54 AM   Healthy
    SectorTest\MBX-3    Healthy      0                0                  3/25/2013 5:41:54 AM   Healthy

  4. On each DAG member hosting a database copy, open a command prompt and navigate to the log file directory. Execute eseutil /r ENN to perform a soft recovery. This step is necessary to ensure that all log files are played into all copies.

    Z:\SectorTest>eseutil /r e01

    Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
    Version 14.02
    Copyright (C) Microsoft Corporation. All Rights Reserved.
    Initiating RECOVERY mode...
        Logfile base name: e01
                Log files: <current directory>
             System files: <current directory>
    Performing soft recovery...
                          Restore Status (% complete) 
              0    10   20   30   40   50   60   70   80   90  100
              |----|----|----|----|----|----|----|----|----|----|
              ...................................................
    Operation completed successfully in 0.203 seconds.

  5. On each DAG member hosting a database copy open a command prompt and navigate to the database directory. Execute eseutil /mh <EDB> against the database to dump the header. You must validate that the following information is correct on all database copies:

    • All copies of the database show in clean shutdown.
    • All copies of the database show the same last detach information.
    • All copies of the database show the same last consistent information.

    Here is example output of a full /mh dump followed by a comparison of the data across our three sample copies.

    Z:\SectorTest>eseutil /mh SectorTest.edb

    Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
    Version 14.02
    Copyright (C) Microsoft Corporation. All Rights Reserved.
    Initiating FILE DUMP mode...
             Database: SectorTest.edb
    DATABASE HEADER:
    Checksum Information:
    Expected Checksum: 0x010f4400
      Actual Checksum: 0x010f4400
    Fields:
            File Type: Database
             Checksum: 0x10f4400
       Format ulMagic: 0x89abcdef
       Engine ulMagic: 0x89abcdef
    Format ulVersion: 0x620,17
    Engine ulVersion: 0x620,17
    Created ulVersion: 0x620,17
         DB Signature: Create time:03/19/2013 09:40:15 Rand:11009066 Computer:
             cbDbPage: 32768
               dbtime: 601018 (0x92bba)
    State: Clean Shutdown
         Log Required: 0-0 (0x0-0x0)
        Log Committed: 0-0 (0x0-0x0)
       Log Recovering: 0 (0x0)
      GenMax Creation: 00/00/1900 00:00:00
             Shadowed: Yes
           Last Objid: 3350
         Scrub Dbtime: 0 (0x0)
           Scrub Date: 00/00/1900 00:00:00
         Repair Count: 0
          Repair Date: 00/00/1900 00:00:00
    Old Repair Count: 0
    Last Consistent: (0x138,3FB,1A4)  03/19/2013 13:44:11
          Last Attach: (0x111,9,86)  03/19/2013 13:42:29
    Last Detach: (0x138,3FB,1A4)  03/19/2013 13:44:11
                 Dbid: 1
        Log Signature: Create time:03/19/2013 09:40:14 Rand:11019164 Computer:
           OS Version: (6.1.7601 SP 1 NLS ffffffff.ffffffff)

    Previous Full Backup:
            Log Gen: 0-0 (0x0-0x0)
               Mark: (0x0,0,0)
               Mark: 00/00/1900 00:00:00

    Previous Incremental Backup:
            Log Gen: 0-0 (0x0-0x0)
               Mark: (0x0,0,0)
               Mark: 00/00/1900 00:00:00

    Previous Copy Backup:
            Log Gen: 0-0 (0x0-0x0)
               Mark: (0x0,0,0)
               Mark: 00/00/1900 00:00:00

    Previous Differential Backup:
            Log Gen: 0-0 (0x0-0x0)
               Mark: (0x0,0,0)
               Mark: 00/00/1900 00:00:00

    Current Full Backup:
            Log Gen: 0-0 (0x0-0x0)
               Mark: (0x0,0,0)
               Mark: 00/00/1900 00:00:00

    Current Shadow copy backup:
            Log Gen: 0-0 (0x0-0x0)
               Mark: (0x0,0,0)
               Mark: 00/00/1900 00:00:00 

         cpgUpgrade55Format: 0
        cpgUpgradeFreePages: 0
    cpgUpgradeSpaceMapPages: 0 

           ECC Fix Success Count: none
       Old ECC Fix Success Count: none
             ECC Fix Error Count: none
         Old ECC Fix Error Count: none
        Bad Checksum Error Count: none
    Old bad Checksum Error Count: none 

      Last checksum finish Date: 03/19/2013 13:11:36
    Current checksum start Date: 00/00/1900 00:00:00
          Current checksum page: 0

    Operation completed successfully in 0.47 seconds.

    MBX-1:

    State: Clean Shutdown
    Last Consistent: (0x138,3FB,1A4)  03/19/2013 13:44:11
    Last Detach: (0x138,3FB,1A4)  03/19/2013 13:44:11

    MBX-2:

    State: Clean Shutdown
    Last Consistent: (0x138,3FB,1A4)  03/19/2013 13:44:12
    Last Detach: (0x138,3FB,1A4)  03/19/2013 13:44:12

    MBX-3:

    State: Clean Shutdown
    Last Consistent: (0x138,3FB,1A4)  03/19/2013 13:44:13
    Last Detach: (0x138,3FB,1A4)  03/19/2013 13:44:13

    In this case, the values match across all copies so further steps can be performed.

    If the values do not match across copies for any reason, do not continue and please contact Microsoft support.

  6. Reset the log file generation for the database.

    Note: Use Get-MailboxDatabaseCopyStatus to record database locations and status prior to performing this activity.

    Locate the log file directory for each ACTIVE (DISMOUNTED) database. Remove all log files from this directory first. Failure to remove log files from the ACTIVE (DISMOUNTED) database may result in the Replication service recopying log files, a failure of this procedure, and subsequent need to reseed all database copies.

    IMPORTANT: If log files are located in the same location as the database and catalog data folder, take precautions to not remove the database or the catalog data folder.

    In our example MBX-1 hosts the ACTIVE (DISMOUNTED) copy.

    [PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\*

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Dismounted   0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                0                  3/25/2013 5:41:54 AM   Healthy
    SectorTest\MBX-3    Healthy      0                0                  3/25/2013 5:41:54 AM   Healthy

    Locate the log file directory for each PASSIVE database. Remove all log files from this directory. Failure to remove all log files could result in this procedure failing, and the need to reseed this or all database copies. If log files are located in the same location as the database and catalog data folder take precautions to not remove the database or the catalog data folder.

    In our example MBX-2 and MBX-3 host the passive database copies.

    [PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\*

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Dismounted   0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                0                  3/25/2013 5:41:54 AM   Healthy
    SectorTest\MBX-3    Healthy      0                0                  3/25/2013 5:41:54 AM   Healthy
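
    As a hedged illustration only (the Z:\SectorTest path and E01 log prefix come from this example; substitute your own values, and double-check the filter so the .edb file and the catalog data folder are not touched), removing the log files for a copy might look like:

    [PS] C:\>Remove-Item -Path Z:\SectorTest\E01*.log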

  7. Mount the database using Mount-Database <DBNAME>, and verify it has mounted.

    [PS] C:\>Mount-Database SectorTest
    [PS] C:\>Get-MailboxDatabaseCopyStatus *

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Mounted      0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                1                  3/25/2013 5:57:28 AM   Healthy
    SectorTest\MBX-3    Healthy      0                1                  3/25/2013 5:57:28 AM   Healthy

  8. Suspend and resume all passive database copies.

    Note: The error on suspending the active database copy is expected.

    [PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\* | Suspend-MailboxDatabaseCopy

    The suspend operation can't proceed because database 'SectorTest' on Exchange Mailbox server 'MBX-1' is the active mailbox database copy.
        + CategoryInfo          : InvalidOperation: (SectorTest\MBX-1:DatabaseCopyIdParameter) [Suspend-MailboxDatabaseCopy], InvalidOperationException
        + FullyQualifiedErrorId : 5083D28B,Microsoft.Exchange.Management.SystemConfigurationTasks.SuspendDatabaseCopy
        + PSComputerName        : mbx-1.exchange.msft

    Note: The error on resuming the active database copy is expected.

    [PS] C:\>Get-MailboxDatabaseCopyStatus SectorTest\* | Resume-MailboxDatabaseCopy

    WARNING: The Resume operation won't have an effect on database replication because database 'SectorTest' hosted on server 'MBX-1' is the active mailbox database.

  9. Validate replication health.

    [PS] C:\>Get-MailboxDatabaseCopyStatus *

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Mounted      0                0                                         Healthy
    SectorTest\MBX-2    Healthy      0                0                  3/19/2013 1:56:12 PM   Healthy
    SectorTest\MBX-3    Healthy      0                0                  3/19/2013 1:56:12 PM   Healthy

  10. Using Set-MailboxDatabaseCopy, reconfigure any replay lag or truncation lag time on the database copy. This example implements a 7 day replay lag time.

    Set-MailboxDatabaseCopy -Identity SectorTest\MBX-3 -ReplayLagTime 7.0:0:0

  11. Repeat the previous steps for all databases in the DAG including those databases that have a single copy.

    IMPORTANT: DO NOT proceed to the next step until all databases have been reset.

  12. Enable block mode replication. Using the registry editor, navigate to HKLM\Software\Microsoft\ExchangeServer\V14\Replay\Parameters, and then remove the DisableGranularReplication DWORD value.
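
    A rough Shell equivalent of removing the value (an illustration only; run it on each DAG member):

    [PS] C:\>Remove-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\ExchangeServer\V14\Replay\Parameters" -Name DisableGranularReplication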

  13. Restart the replication service on each DAG member.

    Restart-Service MSExchangeREPL

  14. Validate database health using Get-MailboxDatabaseCopyStatus.

    [PS] C:\>Get-MailboxDatabaseCopyStatus *

    Name                Status       CopyQueueLength  ReplayQueueLength  LastInspectedLogTime   ContentIndexState
    ----                ------       ---------------  -----------------  --------------------   -----------------
    SectorTest\MBX-1    Healthy      0                0                  3/19/2013 2:25:56 PM   Healthy
    SectorTest\MBX-2    Mounted      0                0                                         Healthy
    SectorTest\MBX-3    Healthy      0                230                3/19/2013 2:25:56 PM   Healthy

  15. Dump the header of a log file and verify that the new sector size is reflected in the log file stream. To do this, open a command prompt and navigate to the log file directory for the database on the active node. Run eseutil /ml against any log within the directory, and verify that the “Env Log Sec size” value reflects 4096 (matches).

    Z:\SectorTest>eseutil /ml E0100000001.log

    Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
    Version 14.02
    Copyright (C) Microsoft Corporation. All Rights Reserved.

    Initiating FILE DUMP mode... 
          Base name: E01
          Log file: E0100000001.log
          lGeneration: 1 (0x1)
          Checkpoint: (0x17B,FFFF,FFFF)
          creation time: 03/19/2013 13:56:11
          prev gen time: 00/00/1900 00:00:00
          Format LGVersion: (7.3704.16.2)
          Engine LGVersion: (7.3704.16.2)
          Signature: Create time:03/19/2013 13:56:11 Rand:2996669 Computer:
          Env SystemPath: z:\SectorTest\
          Env LogFilePath: z:\SectorTest\
         Env Log Sec size: 4096 (matches)
          Env (CircLog,Session,Opentbl,VerPage,Cursors,LogBufs,LogFile,Buffers)
              (    off,   1227,  61350,  16384,  61350,   2048,    256,  44204)
          Using Reserved Log File: false
          Circular Logging Flag (current file): off
          Circular Logging Flag (past files): off
          Checkpoint at log creation time: (0x1,1,0) 
          Last Lgpos: (0x1,2,0)
    Number of database page references:  0
    Integrity check passed for log file: E0100000001.log
    Operation completed successfully in 0.250 seconds.

If the above steps have been completed successfully, and the log file sequence recognizes a 4096 sector size, then this issue has been resolved.

This guidance was validated in the following configurations:

  • Windows 2008 R2 Enterprise with Exchange 2010 Service Pack 2
  • Windows 2008 R2 Enterprise with Exchange 2010 Service Pack 3
  • Windows 2008 SP2 Enterprise with Exchange 2010 Service Pack 3
  • Windows 2012 Datacenter with Exchange 2010 Service Pack 3

Tim McMichael

Troubleshoot your Exchange 2010 database backup functionality with VSSTester script


Frequently in support, we encounter backup-related calls for Exchange 2010 databases. Common issues we hear from our customers include:

  • “My backup software is not able to take a successful snapshot of the databases”
  • “My backups have been failing for quite a while. I have several thousand log files consuming disk space and I will eventually run out of disk space”
  • “My backup software indicates that the backup is successful but at the end of my backup, logs do not truncate”
  • “The Exchange Writer/VSS writer is not in a stable state (state is listed as ‘Retryable’, ‘Waiting for completion’ or ‘Failed’)”
  • “We suspect that the Volume Shadow Copy Service (VSS) is failing on the server and hence there are no successful backups”

It is critical to understand how backups and log truncation work in Exchange 2010. If you haven't already done so, check out our three-part blog series by Jesse Tedoff on backups and log truncation in Exchange 2010, Everything You Need to Know About Exchange Backups*.

When troubleshooting backups in Exchange 2010 we are interested in two writers – the Exchange Information Store Writer (utilized for active copy backups) and the Exchange Replica Writer (utilized for passive copy backups). The writers are responsible for providing the metadata information for databases to the VSS Requestor (aka the backup software). The VSS Provider is the component that creates and maintains shadow copies. At the end of successful backups, when the Volume Shadow Copy Service signals backup is complete, the writers initiate post-backup steps which include updating the database header and performing log truncation. (For more details, see Exchange VSS Writers on MSDN.)

As explained above, it is the responsibility of the VSS Requestor to get metadata information from Exchange writers and at the end of successful backup, VSS service signals backup complete to the Exchange writers so the writers can perform post-backup operations.

The purpose of this blog is to discuss the VSSTester script, its functionality and how it can help diagnose backup problems.

What does the script do?

The script has two major functions:

  1. Perform a Diskshadow backup of a selected Exchange database to exercise the VSS framework on the system, so that at the end of a successful snapshot the database header is updated and log files are truncated. We will discuss in detail what Diskshadow is and what it does.
  2. The second function of this script is to collect diagnostic data. For backup cases, there is a lot of data that needs to be collected. To get the diagnostic data, you may have to manually go to different places on the Exchange server and turn on logging. If that is not done correctly, we will miss crucial logs during the time of the issue. The script makes the data collection process much easier.

 

Script requirements

  1. The current version of the script works only on Exchange 2010 servers.
  2. The script needs to be run on the Exchange server that is experiencing backup issues. If you are having issues with passive copy backups, please go to the appropriate node in the DAG and run the script. For example: You may have Database A having copies on Server1, Server2 and Server3. Server1 hosts the active copy of the database. If backups of the active copy have previously failed run the script on Server1. Otherwise run script on whichever of the remaining servers has failed previously when backing up the passive copy.
  3. Please ensure that you have enough space on the drive where you save the configuration and output files. Exchange and VSS traces and diagnostic logs can occupy up to several GB of drive space, depending on how long the backup takes. For example, running the script in a lab environment consumed close to 25 MB of drive space a minute.
  4. The script is unsigned. On the server where you run the script you will have to set the execution policy to allow unsigned PowerShell scripts. Please see this for how to do this.
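
    As an illustration only (the policy you choose should follow your organization's standards), allowing the unsigned script to run could look like the following. Note that if the downloaded .ps1 file is still flagged as coming from the Internet, you may also need to unblock it or use a less restrictive policy.

    [PS] C:\>Set-ExecutionPolicy RemoteSigned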

The script can be run on any DAG configuration. You can use this to troubleshoot Mailbox and Public folder database backup issues. Databases and log files can be on regular drives or mount points. Mix and match of the two will also work!

Let us discuss in detail the two main functionalities of the script.

Diskshadow functionality and how the script uses it

What is Diskshadow and why do we utilize it in VSSTester script?

Diskshadow.exe is a command-line tool built into the Windows Server 2008 operating system family as well as Windows Server 2012. Diskshadow is an in-box VSS requestor. It is utilized to test the functionality provided by the Volume Shadow Copy Service (VSS). For more details on Diskshadow please visit:

http://technet.microsoft.com/en-us/library/ee221016(v=ws.10).aspx

http://blogs.technet.com/b/josebda/archive/2007/11/30/diskshadow-the-new-in-box-vss-requester-in-windows-server-2008.aspx

The best part about Diskshadow is that it includes a script mode for automating tasks. This feature of Diskshadow is utilized in the VSSTester. The shadow copy done by Diskshadow is a snapshot of the entire volume at a given point in time. This copy is read-only.

For more details on how a shadow copy is created, please visit the following link: http://technet.microsoft.com/en-us/library/ee923636(v=ws.10).aspx

During the course of the blog post, I will be mentioning the term “Diskshadow backup”. It is very important to understand that the term “backup” is relative here. Diskshadow uses the VSS service and gets the appropriate writer to be utilized for the snapshot. The writer provides the metadata information for the database and log files to Diskshadow, after which Diskshadow utilizes the VSS Provider to create a shadow copy.

After a successful shadow copy/snapshot of databases and log files, the VSS Provider signals an end-backup to the Exchange writers. To Exchange this looks like a full backup has been performed on the database. The key to understand here is that NO data is actually transferred to a device, tape, etc. This is only a test! You will see the events in the application log that usually show up when you take a regular backup, but NO data is actually backed up. Diskshadow has simply run all the backup APIs through the backup process without transferring any data.

The VSS Provider will take a snapshot of all the databases and logs (if present) on the volume. We will be doing a mirrored snapshot of the entire volume at the point in time when Diskshadow was run. Anything that is on the volume will be part of the snapshot. During the Diskshadow backup, we will be utilizing either the Information store writer (for active copy backup) or the Replica Writer (Passive copy backup) to provide the metadata information for the database.

When you use the VSSTester script, it prompts you to select a database to perform the Diskshadow backup against. When we take a snapshot of the volume, all other databases (if present on the same drive) will be part of the snapshot, but post-backup operations will happen only on the selected database. This is because we will be utilizing either the Information Store Writer (active copy backup) or the Replica Writer (passive copy backup) that is associated with the selected database. The database header gets updated based on the VSS Requestor's interaction with the Exchange writer that was utilized, which in turn leads to log truncation. Hence, only the header of the selected database will be updated and only that database's logs will be purged, without being backed up.

When would you be interested in utilizing this Diskshadow functionality of the script?

You would be interested to utilize this functionality in almost all scenarios that I discussed at the start of this blog post. In addition to those scenarios another one that is not related to backups sometimes arises:

  • “I had an unexpected high transactional log growth issue in my exchange 2010 environment and now I am on the verge of losing all disk space in the logs directory. I do not have the time to perform a backup to truncate logs and my goal is to safely remove all the log files”

In the scenario mentioned above (and, by the way, if you have that problem, please go here), Exchange administrators would like to avoid causing a service outage by dismounting the database, removing log files and remounting the database. Another downside to manually removing the log files is breaking replication if the database has replicas across Database Availability Group members.

If you are willing to forgo a backup of the log files you can use the Diskshadow functionality of the script to trigger the backup APIs and tell Exchange to truncate the log files. The truncation commands will replicate to the other database copies and purge log files there as well. If successful, the net result is that the database will not go offline for lack of disk space on the log drive, but you will not have the security of retaining those log files for a future restore.

A sample run of the VSSTester script (with Diskshadow functionality)

Let me demonstrate the Diskshadow functionality of the script.

The Script can be downloaded from TechNet gallery here.

The script initializes and presents its options (test a backup using Diskshadow, or enable logging only).


We select option 1 to test backup using the built-in Diskshadow function.


If the path does not exist, the script will create the folder for you.

We gather the server name and verify it is an Exchange 2010 server. The script will check the VSS writer status on the local machine. If we detect that any of the writers are not in a “Stable” state, the script will exit. You will need to restart the service associated with the writer to get the writers to a stable state (the Replication service for the Replica Writer or the Information Store service for the Exchange Writer).

The script then gets a list of databases present on the local server and displays the database name, whether the database is mounted, and which server holds the active copy of the database. You will have to select the number of the database.

Note: If the user does not provide an input, the script will automatically select the last database in the list.

In my case, I selected database mdb5. The number to enter would be 8.


The next important check is ensuring that the database’s replicas (if present) are healthy. If we detect that one of the copies is not healthy, the script will exit mentioning that the database copies need to be in healthy status before running the script.


The script next detects the location of the database file and log files. We create the Diskshadow configuration file on the fly every time a database is selected. This configuration file is also saved to the location you specified earlier for the configuration and output files (c:\vsstesterlogs in the example screenshots of this blog). In this case the log files are in a mount point and the database file is on a regular volume. The script will add the appropriate volumes to the Diskshadow configuration file.


The script will then prompt you to provide the drive letters to expose the snapshots. A common question that arises is, do I need to initialize the drive before I specify a drive letter? The answer is no!

You will be specifying a drive letter that is currently not in use, so Diskshadow will create a virtual drive and expose the snapshot. Remember, the virtual drive that exposes the shadow copy is a read-only volume. If the database and logs are in the same mount point or drive, only one drive letter is required to expose the snapshot; otherwise you will need to provide two different drive letters, one for exposing the database snapshot and another for the log files.


When you select the option to perform the Diskshadow backup, the script will automatically collect Diagnostic logs, ExTRA traces and VSS traces. Also, verbose logging is turned on for Diskshadow. Whatever activity the script performs is also logged to a transcript log and saved in the output files directory (c:\vsstesterlogs in this example).


Note: If you are performing a passive copy backup, ExTRA tracing will also be turned on for the active node. At the end of the script, we turn off ExTRA tracing on the active node and the trace will automatically be moved to the passive node. The active node ETL will be placed in the logs folder you specified at the start of the script.

Now, the main Diskshadow function will execute.

Here we have excluded all other writers on the system that are associated with all other databases on the node (whether mounted or replicas), and we are ONLY utilizing the writer associated with the selected database. This node hosts the passive copy of the database MDB5. Hence, the writer utilized will be associated with the Replication service, aka the Microsoft Exchange Replica Writer.


At this point the VSS Provider has taken a successful snapshot of the database and signaled end-backup to the Replica Writer.


Now that we performed a successful snapshot of the database and log files, all the logging that was turned on will be turned off. The log files will be consolidated in the logs folder that you specified earlier at the start of the script. The script checks the VSS writer status after the backup is complete.


When the snapshot operation is complete, you will be prompted for an option to either remove the snapshot or leave the snapshots exposed in Windows Explorer.


I selected the option to remove the snapshot; hence we will be invoking Diskshadow again to delete the snapshot created earlier.

Let us discuss in detail exposing and removing snapshot functionality:

  1. Remove snapshots - The snapshots that were taken earlier (database or log files) will be exposed in Windows Explorer if the snapshot operation was successful. In this script we expose the snapshots as a drive letter (that you had specified earlier). If you do not want to have a copy of the log files, you may choose this option and the snapshot will be deleted. All the logs that got purged after post-backup will be present in this read-only volume, and when this volume is removed they will be deleted forever.
  2. Expose snapshots - You may choose to have the snapshots exposed. Later, if you want to delete the snapshot, please do the following:
    • Open Command prompt
    • Type in Diskshadow
    • Delete shadows exposed <volume>
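
    For example (a sketch only; the drive letter q: is a placeholder for whatever letter you exposed the snapshot as):

    C:\>diskshadow
    DISKSHADOW> delete shadows exposed q:
    DISKSHADOW> exit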

Note: It is highly recommended to take a full backup of the database using your regular backup software after utilizing Diskshadow.

After this, the script collects the application and system logs. The script filters them to cover only the period from when you started the script to the present. The transcript log is also stopped. The logs will be saved as text files in the output folder you specified earlier (c:\vsstesterlogs in this example).


The most reliable method to verify that log truncation takes place is to get the log sequence before and after the backup. Hence, before running the script I ran eseutil /ml ENN (where ENN is the log generation prefix associated with the database).


Post-backup, I ran the same command again.


Comparing the two outputs, we can clearly see a difference in the start of the sequence, meaning log truncation has occurred for the database. One more verification that can be done is to check the database header; we can see that the database header got updated with the most recent time, when Diskshadow was run.


I ran the script; what have I accomplished?

If the script finished successfully:

  • We were able to successfully test and exercise the underlying VSS framework on the server. The Volume Shadow Copy Service was able to successfully identify and utilize the Exchange writers on the box
  • The Exchange writers were able to provide the metadata information to the VSS Requestor (Diskshadow)
  • The VSS Provider was able to successfully create a snapshot/shadow copy
  • VSS successfully signaled backup complete to the Exchange writers
  • The Exchange writers were able to perform the post-snapshot operations, which included log truncation.

Let us now look in to the other major functionality of the script.

Enable logging to troubleshoot backup issues

Use this if you do not want to test backup using Diskshadow and you just want to collect diagnostic logs for troubleshooting backup issues.

You may collect the diagnostic logs and have them handy before calling Microsoft Support saving a lot of time in the support incident because you can provide the files at the beginning of the case.

This time we will be selecting option 2 to enable logging.


Selecting this option does the majority of the things that the script did earlier, EXCEPT Diskshadow of course!

After checking the writer status, you can select the database to back up. We will be enabling all the logging like before (Diagnostic Logging, ExTRA, VSS tracing). Remember that even though you still select one database, diagnostic logging, ExTRA tracing, and VSS tracing are not database-specific and are turned on at the server level. When you are utilizing the script to troubleshoot backup issues, you can select any one database on the server and it will turn on the appropriate logging on the server.

After the logging is turned on and the traces are enabled, the script waits while you run your backup.


Now you will need to start your regular backup. After the backup completes or fails, you will need to come back to the PowerShell window where you are running the script and press the ENTER key to terminate the data collection. The script then disables the diagnostic logging and tracing that was turned on earlier. If needed, it will copy diagnostic logs from the active node for that database copy as well.

The script will again check the writer status after the backup, then collect the application and system logs. It will stop the transcript log as well.

At this point, in order to troubleshoot the issue, you can open a case with Microsoft Support and upload the logs.

I hope this script helps you in better understanding the core concepts in Exchange 2010 backups, thus helping you troubleshoot backup issues! You can utilize Diskshadow to test Volume Shadow Copy Service and also check if the Exchange writers are performing as intended. If Diskshadow completes successfully without any error and you are still experiencing issues with backup software, you may need to contact the backup vendor to further troubleshoot the issue.

Your feedback and comments are most welcome.

Special thanks to Michael Barta for his contribution to the script, Theo Browning and Jesse Tedoff for reviewing the content.

Muralidharan Natarajan

Introducing Message Analyzer, an SMTP header analysis tool in Microsoft Remote Connectivity Analyzer


Microsoft Remote Connectivity Analyzer is a web-based tool that provides administrators and end users with the ability to run connectivity diagnostics for our servers to test common issues with Microsoft Exchange, Lync and Office 365. The tool started as Microsoft Exchange Server Remote Connectivity Analyzer, and based on your feedback we've continued to add functionality to test connectivity with Lync and Office 365, and made other enhancements such as tests for Outlook Anywhere, Exchange Web Services, outbound SMTP, Office 365 Single Sign-On test, support for 10 additional languages and an improved captcha experience.

We're excited to announce Message Analyzer, a brand new addition to the Remote Connectivity Analyzer. Message Analyzer makes reading email headers less painful.

Screenshot: Message Analyzer tab
Figure 1: The new Message Analyzer tab in RCA

SMTP message headers contain a wealth of information which allows you to determine the origins of a message and how it made its way through one or more SMTP servers to its destination. To use Message Analyzer, all you need to do is copy message headers from a message and paste them in the Message Analyzer tab on the RCA web site.

Screenshot 2: Paste message headers in Message Analyzer
Figure 2: Paste message headers in the Message Analyzer

Trying to locate message headers in Outlook 2010 and later? See Hey Outlook 2010, where are my message headers?

Features of the Message Analyzer

Here's a quick look at what you can do with Message Analyzer.

  • View the most important properties and total delivery time at a quick glance.

    Screenshot 3: Message Analyzer tab
    Figure 3: View the most important header properties and delivery time

  • Analyze the received headers and quickly display the longest delays for easy discovery of the sources of message transfer delays.

    Screenshot 4: View where longest message transfer delays occurred
    Figure 4: Quickly detect where the longest message transfer delays occurred

  • Sort all headers by header name or value.

    Screenshot 5: Sort message headers
    Figure 5: Sort message headers

  • Quickly collapse the sections that you don’t need.

  • All processing is done in your browser, and no private information is shared with Microsoft.

  • Useful for any header, whether generated by Exchange, Office 365, or any other RFC standard SMTP server or agent.

Note, we consider this feature to be in beta for the moment. Please send us feedback and we’ll continue to make improvements.

Check out this update to the RCA at testconnectivity.microsoft.com (short URL: aka.ms/rca).

Stephen Griffin & Scott Landry
On behalf of the entire MCA/RCA team
Follow the team on Twitter - @ExRCA

Public Folders and Exchange Online


Update 6/5/2013: We have updated the blog post to add the link to the first TechNet document on public folder Hybrid scenarios.

“You mean… this is really happening?”

Last November we gave you a teaser about public folders in the new Exchange. We explained how public folders were given a lot of attention to bring their architecture up-to-date, and as a result of this work they would take advantage of the other excellent engineering work put into Exchange mailbox databases over the years. Many of you have given the new public folders a try in Exchange Online and Exchange Server 2013 in your on-premises environments. At this time we would like to give you a bit more detail surrounding the Exchange Online public folder feature set so you can start planning what makes sense for your environment. So, yes, we really meant our beloved public folders were coming to Exchange Online!

How do we move our public folders to Exchange Online?

We are still putting the finishing touches on some of our migration documentation for on-premises Exchange Server environments to Exchange Online. We know there is a lot of interest in this documentation and we are making sure it is as easy to follow as possible. We will update this article with links to the content when more documentation becomes available on TechNet. The following two articles are available now.

Important

Before we cover the migration process at a high level (and very deeply in those TechNet articles!), we want to be very clear everyone understands the following few important points.

  • Public Folder migrations to Exchange Online should not be performed unless all of your users are located in Exchange Online, and/or all of your on-premises users are on Exchange Server 2013.

  • Public folder migrations are a cutover migration. You cannot have some public folders on-premises and some public folders in Exchange Online. There will be a small window of public folder access downtime required when the migration is completed and all public folder connections are moved from on-premises to Exchange Online.

  • Public folder migrations are entirely PowerShell based at this time. Once the migration has completed you can then perform your public folder management in the tool of your choice, EAC or PowerShell.

So what are the steps I can expect to go through?

In the TechNet content we walk you through exactly how to use PowerShell and some scripts provided by the product group to help automate the analysis and content location mapping in Exchange 2013 or Exchange Online. The migration process is similar whether you are doing an on-premises to on-premises migration, or an on-premises to Exchange Online migration with the latter having a couple more twists. Both scenarios will include a few major steps you will go through to migrate your legacy public folder infrastructure. Again, the following section is meant to be an overview and not a complete rendering of what the more detailed step-by-step TechNet documentation contains. Consider this section an appetizer to get you thinking about your migration and what potential caveats may or may not affect you. The information below is tailored more to an Exchange Online migration, but our on-premises customers will also be facing many of the same steps and considerations.

Prepare Your Environment

  • Are my on-premises servers at the necessary patch levels?
    • Exchange 2007 SP3 RU10 or later
    • Exchange 2010 SP3 or later
    • Exchange 2013 RTM CU1 or later
      • The CU1 released on April 2nd 2013 is necessary. Because there is no Service Pack released for Exchange 2013 at this time it is referred to as RTM CU1.
  • Are my Windows Outlook users using client versions at the necessary patch levels?
    • Outlook 2007, 12.0.6665.5000 or later
    • Outlook 2010, 14.0.6126.5000 or later
    • Outlook 2013, 15.0.4420.1017 or later
  • Are all on-premises users on Exchange Server 2013 or have been moved to Exchange Online?

Analyze Your Current Public Folders and Content

(Size limits pertain to Exchange Online)

  • What does my current public folder infrastructure look like?
    • Who has access to what?
    • What is my total content size?
      • Is the total public folder content on Exchange 2007/2010 over 950 GB when Get-PublicFolderStatistics is run? (“Why” is discussed later; a sizing sketch follows this list)
      • Is the total public folder content on Exchange 2013 over 1.25 TB when Get-PublicFolderStatistics is run?
    • Is any single public folder over 15GB that we should trim down first? (“Why” is discussed later)
  • What will my public folder mailbox layout be?
    • Can my content fit within the allowed public folder mailboxes and their quotas?
    • What public folders will go into what public folder mailboxes?
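
One hedged way to start answering the size questions above from the Exchange Management Shell (an illustration only, shown as a sketch because exact parameters and the TotalItemSize formatting can vary between Exchange versions and remote sessions):

    # Gather statistics for every public folder
    $stats = Get-PublicFolderStatistics -ResultSize Unlimited

    # Largest individual folders, to spot anything approaching the 15 GB suggestion
    $stats | Sort-Object TotalItemSize -Descending | Select-Object -First 20 FolderPath, TotalItemSize

    # Rough total size in GB, parsed from the "(n bytes)" portion of TotalItemSize
    ($stats | ForEach-Object { [long]($_.TotalItemSize.ToString() -replace '.*\(|,| bytes\).*', '') } | Measure-Object -Sum).Sum / 1GB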

Create the Initial Public Folder Mailboxes

  • Public folder mailboxes are created by the admin so your content has a place to live in Exchange Online. Customers with less than 25GB of content may only need a single public folder mailbox to start, but our scripts will help you determine your starting layout while backend automation will determine if you need more public folder mailboxes down the road. On-premises customers will utilize quota values that make sense for their own deployments.
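
    For example, creating the first public folder mailbox in Exchange Online or Exchange 2013 might look like this (the mailbox name is just an example, and the migration articles call out additional parameters for the migration scenario):

    New-Mailbox -PublicFolder -Name PFMBX-001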

Begin the Migration Request and Initial Data Sync

  • The initial copy of public folder content from on-premises to Exchange Online is performed. This may take a long time depending on how much content you have. There is no easy way to predict the length of time it will take as there are many variables to consider, but you can monitor the progress via PowerShell. Users will continue using the on-premises public folder infrastructure during this time so there is no impact to the on-premises environment.
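
    The exact migration cmdlets are covered in the TechNet articles mentioned earlier; as a rough illustration only, monitoring the progress of the copy from PowerShell looks something like this:

    Get-PublicFolderMigrationRequest | Get-PublicFolderMigrationRequestStatistics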

Perform Delta Syncs of Changed Content

  • These content delta syncs run by the admin help shorten the window of downtime for the finalization process by copying only data changed after the initial migration request copy was performed. Numerous delta syncs may be required in large environments with many public folder servers.

Lock On-premises Public Folders and Finalize the Migration Request

  • Access to the on-premises public folder environment is blocked and a final delta sync of changed data is performed. When this stage is completed your Exchange Online public folders will be ready for user access. The access block is required to prevent any content changes taking place on-premises just before your users connections are transitioned to the Exchange Online public folder environment.

Validate the Exchange Online Public Folder Environment

  • Create new content and permission reports, and compare them to the reports created prior to the migration.
    • If the administrator is happy, the new Exchange Online public folders will then be unlocked for user access.
    • If the administrator feels the migration was not successful, a roll back to the on-premises public folder infrastructure is initiated. However, if any changes were made to Exchange Online public folders such as content, permissions, or folders created/deleted before the rollback is initiated, then those changes will not be replicated to the on-premises infrastructure.

Removal of legacy public folder content

  • The administrator will remove the public folder databases from the on-premises infrastructure.

Microsoft, what can I do/not do with these things in Exchange Online?

Now that we have given you an idea of what the migration process will be, let us talk about the feature itself. Starting with the new Office 365, customers of Exchange Online will be able to store, free of charge, approximately 1.25 terabytes of public folder data in the cloud. Yes, you read that right… over a terabyte. The way this works is your tenant will be allowed to create up to fifty (50) public folder mailboxes, each yielding a 25 GB quota. However, when operating in a hybrid environment, public folders can exist only on-premises or in Exchange Online.

Once you complete the migration process of public folders to Exchange Online, the on-premises public folder infrastructure will have its hierarchy locked to prevent user connections and its content frozen at that point in time. By locking the on-premises content we provide you with a way to rollback a migration from Exchange Online, if you deem it necessary. However, as mentioned before, a rollback can result in data loss as no changes made while using the Exchange Online public folder infrastructure are copied back on-premises.

We will support on-premises Exchange Server 2013 users accessing Exchange Online public folders. We will also support Exchange Online users accessing on-premises public folders if you choose to keep your public folder infrastructure local. The table below depicts which users can access which public folder infrastructures. Please note that for a hybrid deployment, on-premises users must be on Exchange 2013 if you wish for them to access Exchange Online public folders. Also, it bears repeating that public folders can only exist in one location, on-premises or in Exchange Online. You cannot have two different public folder infrastructures being utilized at once.

                       PF location:
Mailbox version        2007 On-Premises   2010 On-Premises   2013 On-Premises   Exchange Online
Exchange 2007          Yes                Yes                No                 No
Exchange 2010          Yes                Yes                No                 No
Exchange 2013          Yes                Yes                Yes                Yes
New Exchange Online    Yes                Yes                Yes                Yes

How is public folder management in Exchange Online performed?

When your public folder content migration is complete or you create public folders for the very first time, you will not have to worry about managing many aspects of public folders in Exchange Online. As you previously read, public folders in Exchange Server 2013 and Exchange Online are now stored within a new mailbox type in the mailbox database. Our on-premises customers will have to create public folder mailboxes, monitor their usage, create new public folder mailboxes when necessary, and split content to different public folder mailboxes as their content grows over time. In Exchange Online we will automatically perform the public folder mailbox management so you may focus your time managing the actual public folders and their content. If we were to peek behind the Exchange Online curtain, we would see two automated processes running at all times to make everything happen:

  1. Automatic public folder moves based on public folder mailbox quota usage
  2. Automatic public folder mailbox creation based on active hierarchy connection count

Let’s go through each one of them, shall we?

1. Automatic public folder moves based on public folder mailbox quota usage

This process actively monitors your public folder mailbox quota usage. Its goal is to ensure you do not inadvertently fill a public folder mailbox and stop it from being able to accept new content for any public folder within it.

When a public folder mailbox reaches the Issue Warning Quota value of 24.5 GB, this process is automatically triggered to redistribute where your public folders currently reside. This may result in Exchange Online simply moving some public folders from the nearly-filled public folder mailbox to another pre-existing public folder mailbox holding less content. However, if there are no public folder mailboxes with enough free space to move public folders into, Exchange Online will automatically create a new public folder mailbox and move some of your public folders into the newly created public folder mailbox. The end result will be all public folder mailboxes being below the Issue Warning Quota.

Public folder moves from one public folder mailbox to another are an online move process similar to normal mailbox moves. Due to the move process being an online experience your users may experience a slight disruption in accessing one or more public folders during the completion phase of the online move process. Any mail destined for mail enabled public folders being moved would be temporarily queued and then delivered once the move request completes.

In case the curious amongst you are wondering, we do not currently prevent customers from lowering the public folder mailbox quota values, even though there is no reason you should do that. However, you are prevented from configuring quota values larger than 25 GB.
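
A hedged way to keep an eye on this yourself (an illustration only) is to list the public folder mailboxes and how much content each currently holds:

    Get-Mailbox -PublicFolder | Get-MailboxStatistics | Format-Table DisplayName, TotalItemSize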

Let us take a moment to visualize this process as a picture is worth a thousand words. In the first scenario below a customer currently has two public folder mailboxes, PFMBX-001 and PFMBX-002. PFMBX-001 contains three public folders while PFMBX-002 contains only one public folder. PFMBX-001 has gone over the IssueWarningQuota value of 24.5 GB and currently contains 24.6 GB of content. When the automatic split process runs in this environment it sees there is plenty of space available in PFMBX-002, and moves a public folder from PFMBX-001 into PFMBX-002. In this example, the final result is two public folder mailboxes with a similar amount of data in each of them. Depending on the size of your folders this process may move a single large public folder, or numerous small public folders. The example shows a single folder being moved.

Scenario 1: Auto split process shuffles public folders from one public folder mailbox to another one.

In a second scenario below, a customer has a single public folder mailbox, PFMBX-001 containing three public folders. PFMBX-001 has gone over the IssueWarningQuota value of 24.5 GB and contains 24.6 GB of content. When the split process runs in this environment it sees there are no other public folder mailboxes available to move public folders into. As a result, the process creates a new empty public folder mailbox, PFMBX-002, and moves some public folders into the new public folder mailbox; the final result is two public folder mailboxes with a similar amount of data in each of them. Again in this example we are showing a single public folder being moved, but the process may determine it has to move many smaller public folders.

Scenario 2: Auto split process must create a new empty public folder mailbox before moving a public folder.

One noteworthy limit in Exchange Online which should be mentioned is that no single public folder in Exchange Online can be over 25 GB in size, due to the underlying public folder mailbox having a 25 GB quota. To give you an idea of how much data that is: 25 GB is roughly 350,000 items of 75 KB each, or 525,000 items of 50 KB each. In most cases this volume of data can easily be split amongst multiple public folders to avoid a single folder coming anywhere near the 25 GB limit.

Our migration documentation will also suggest that if you currently have a single public folder over 15 GB, you try to reduce that public folder’s size to under 15 GB prior to the migration by deleting old content or splitting it into multiple smaller public folders. When we say a single public folder over 15 GB we mean exactly that, and it excludes any child folders. Any child folder of a parent folder is not considered part of the 15 GB content limit suggestion for these purposes because the child public folder may reside in a different public folder mailbox if necessary. The reason for this suggestion is two-fold. First, it helps prevent you from triggering the automated split process as soon as your migration takes place if you were to migrate very large public folders from on-premises. Second, content moved from Exchange 2007/2010 to Exchange Online may result in the reported space utilized by a single public folder increasing by 30%. The increase is due to a more accurate method used by Exchange Server 2013 to calculate space used within a mailbox database compared to earlier versions of Exchange Server. If you were to migrate a single massive public folder residing in on-premises Exchange Server 2007/2010 to Exchange Online, this space recalculation may push the single public folder over the 25 GB quota. We want to help you avoid this situation as it would only be noticed once you were well into the data copy portion of the migration, and would cost you time having to redo the process all over again.

If you have a particular business requirement which does not allow you to reduce the size of this single massive public folder in one of the ways previously suggested, then we will recommend you retain your entire public folder infrastructure on-premises instead of moving it to Exchange Online as we cannot increase the public folder mailbox quota beyond 25 GB.

2. Automatic public folder mailbox creation based on active hierarchy connection count

The second automated process helps maintain the most optimal user experience accessing public folders in Exchange Online. Exchange Online will actively monitor how many hierarchy connections are being spread across all of your public folder mailboxes. If this value goes over a pre-determined number, we will automatically create a new public folder mailbox. Creating the additional public folder mailbox will reduce the number of hierarchy connections accessing each public folder mailbox by scaling the user connections out across a larger number of public folder mailboxes. If you are a customer who has a small amount of public folder content in Exchange Online, yet you have an extremely large number of active users, then you may see the system create additional public folder mailboxes regardless of your content size.

Ready for another example? In this example we will use low values for explanatory purposes. Let us pretend in Exchange Online we did not want more than two hundred active hierarchy connections per public folder mailbox. The diagram below shows nine hundred users making nine hundred active hierarchy connections across four public folder mailboxes. This scenario will work out to approximately 225 active hierarchy connections per public folder mailbox as the Client Access Servers spread the hierarchy connections across all available public folder mailboxes in the customer’s environment. When Exchange Online monitoring determines the desired number of two hundred active hierarchy connections per public folder mailbox has been exceeded, PFMBX-005 is automatically created. Immediately after creating PFMBX-005, Exchange Online will force a hierarchy sync to PFMBX-005 ensuring it has the most up to date information available regarding public folder structure and permissions before allowing it to accept client hierarchy connections. The end result in this example is we now have five public folder mailboxes accepting nine hundred active hierarchy connections for an average of 180 connections per public folder mailbox, thus assuring all active users have the best interactive experience possible.

Scenario 3: Auto split process creates a new public folder mailbox to scale out active hierarchy connections.

Once you begin utilizing the Exchange Online public folder infrastructure we are confident this built-in automation will help our customers focus on doing what they do best, which is running their business. Let us take care of the infrastructure for you so you have more time to spend on your other projects.

Summary

In summary, we are extremely excited to deliver public folders in the new Exchange Online to you, our customers. We believe you will find the migration process from on-premises to Exchange Online fairly straightforward, and our backend automation will relieve you of having to manage many aspects of the feature. We really hope you enjoy using public folders with Exchange Online as much as we enjoyed creating them for you.

Special thanks to the entire Public Folder Feature Crew, Nino Bilic, Tim Heeney, Ross Smith IV and Andrea Fowler for contributing to and validating this data.

Brian Day
Senior Program Manager
Exchange Customer Experience

Ask the Perf Guy: Sizing Exchange 2013 Deployments

Since the release to manufacturing (RTM) of Exchange 2013, you have been waiting for our sizing and capacity planning guidance. This is the first official release of our guidance in this area, and updates to our TechNet content will follow in a future milestone.

As we continue to learn more from our own internal deployments of Exchange 2013, as well as from customer feedback, you will see further updates to our sizing and capacity planning guidance in two forms: changes to the numbers mentioned in this document, as well as further guidance on specific areas not covered here. Let us know what you think we are missing and we will do our best to respond with better information over time.

First, some context

Historically, the Exchange Server product group has used various sources of data to produce sizing guidance. Typically, this data would come from scale tests run early in the product development cycle, and we would then fine-tune that guidance with observations from production deployments closer to final release. Production deployments have included Exchange Dogfood (our internal pre-release deployment that hosts the Exchange team and various other groups at Microsoft), Microsoft IT’s corporate Exchange deployment, and various early adopter programs.

For Exchange 2013, our guidance is primarily based on observations from the Exchange Dogfood deployment. Dogfood hosts some of the most demanding Exchange users at Microsoft, with extreme messaging profiles and many client sessions per user across multiple client types. Many users in the Dogfood deployment send and receive more than 500 messages per day, and typically have multiple Outlook clients and multiple mobile devices simultaneously connected and active. This allows our guidance to be somewhat conservative, taking into account additional overhead from client types that we don’t regularly see in our internal deployments as well as client mixes that might be different from what's considered “normal” at Microsoft.

Does this mean that you should take this conservative guidance and adjust the recommendations such that you deploy less hardware? Absolutely not. One of the many things we have learned from operating our own very high-scale service is that availability and reliability are very dependent on having capacity available to deal with those unexpected peaks.

Sizing is both a science and an art form. Attempting to apply too much science to the process (trying to get too accurate) usually results in not having enough extra capacity available to deal with peaks, and in the end, results in a poor user experience and decreased system availability. On the other hand, there does need to be some science involved in the process, otherwise it’s very challenging to have a predictable and repeatable methodology for sizing deployments. We strive to achieve the right balance here.

Impact of the new architecture

From a sizing and performance perspective, there are a number of advantages with the new Exchange 2013 architecture. As many of you are aware, a couple of years ago we began recommending multi-role deployment for Exchange 2010 (combining the Mailbox, Hub Transport, and Client Access Server (CAS) roles on a single server) as a great way to take advantage of hardware resources on modern servers, as well as a way to simplify capacity planning and deployment. These same advantages apply to the Exchange 2013 Mailbox role as well. We like to think of the services running on the Mailbox role as providing a balanced utilization of resources rather than having a set of services on a role that are very disk intensive, and a set of services on another role that are very CPU intensive.

Another example to consider for the Mailbox role is cache effectiveness. Software developers use in-memory caching to prevent having to use higher-latency methods to retrieve data (like LDAP queries, RPCs, or disk reads). In the Exchange 2007/2010 architecture, processing for operations related to a particular user could occur on many servers throughout the topology. One CAS might be handling Outlook Web App for that user, while another (or more than one) CAS might be handling Exchange ActiveSync connections, and even more CAS might be processing Outlook Anywhere RPC proxy load for that same user. It’s even possible that the set of servers handling that load could be changing on a regular basis. Any data associated with that user stored in a cache would become useless (effectively a waste of memory) as soon as those connections moved to other servers. In the Exchange 2013 architecture, all workload processing for a given user occurs on the Mailbox server hosting the active copy of that user’s mailbox. Therefore, cache utilization is much more effective.

The new CAS role has some nice benefits as well. Given that the role is totally stateless from a user perspective, it becomes very easy to scale up and down as demands change by simply adding or removing servers from the topology. Compared to the CAS role in prior releases, hardware utilization is dramatically reduced meaning that fewer CAS role machines will be required. Additionally, it may make sense for many customers to consider a multi-role deployment in which CAS and Mailbox are co-located – this allows further simplification of capacity planning and deployment, and also increases the number of available CAS which has a positive effect on service availability. Look for a follow up post on the benefits of a multi-role deployment soon.

Start to finish, what’s the process?

Sizing an Exchange deployment has six major phases, and I will go through each of them in this post in some detail.

  1. You begin the process by making sure you fully understand the available guidance on this topic. If you are reading this post, that’s a great start. There may have been updates posted either here on the Exchange team blog, or over on TechNet. Make sure you take a look before proceeding.
  2. The second step is to gather any available data on the existing messaging deployment (if there is one) or estimate user profile requirements if this is a totally new solution.
  3. The third step is perhaps the most difficult. At this point, you need to figure out all of the requirements for the Exchange solution that might impact the sizing process. This can include decisions like the desired mailbox size (mailbox quota), service level objectives, number of sites, number of mailbox database copies, storage architecture, growth plans, deployment of 3rd party products or line-of-business applications, etc. Essentially, you need to understand any aspect of the design that could impact the number of servers, user count, and utilization of servers.
  4. Once you have collected all of the requirements, constraints, and user profile data, it’s time to calculate Exchange requirements. The easiest way to do this is with the calculator tool, but it can also be done manually as I will describe in this post. Clearly the calculator makes the process much easier, so if the calculator is available, use it!
  5. Once the Exchange requirements have been calculated, it’s time to consider various options that are available. For example, there may be a choice between scaling up (deploying fewer larger servers) and scaling out (deploying a larger number of smaller servers), and the options could have various implications on high availability, as well as the total number of hardware or software failures that the solution can sustain while remaining available to users. Another typical decision is around storage architecture, and this often comes down to cost. There are a range of costs and benefits to different storage choices, and the Exchange requirements can often be met by more than one of these options.
  6. The last step is to finalize the design. At this point, it’s time to document all of the decisions that were made, order some hardware, use Jetstress to validate that the storage requirements can be met, and perform any other necessary pre-production lab testing to ensure that the production rollout and implementation will go smoothly.

Gather requirements and user data

The primary input to all of the calculations that you will perform later is the average user profile of the deployment, where the user profile is defined as the sum of total messages sent and total messages received per-user, per-workday (on average). Many organizations have quite a bit of variability in user profiles. For example, a segment of users might be considered “Information Workers” and spend a good part of their day in their mailbox sending and reading mail, while another segment of users might be more focused on other tasks and use email infrequently. Sizing for these segments of users can be accomplished by either looking at the entire system using weighted averages, or by breaking up the sizing process to align with the various segments of users. In general it’s certainly easier to size the whole system as a unit, but there may be specific requirements (like the use of certain 3rd party tools or devices) which will significantly impact the sizing calculation for one or more of the user segments, and it can be very difficult to apply sizing factors to a user segment while attempting to size the entire solution as a unit.

The obvious question in your mind is how to go get this user profile information. If you are starting with an existing Exchange deployment, there are a number of options that can be used, assuming that you aren’t the elusive Exchange admin who actually tracks statistics like this on an ongoing basis. If you are using Exchange 2007 or earlier, you can utilize the Exchange Profile Analyzer (EPA) tool, which will provide overall user profile statistics for your Exchange organization as well as detailed per-user statistics if required. If you are on Exchange 2010, the EPA tool is not an option for you. One potential option is to evaluate message traffic using performance counters to come up with user profile averages on a per-server basis. This can be done by monitoring the MSExchangeIS\Messages Submitted/sec and MSExchangeIS\Messages Delivered/sec counters during peak average periods and extrapolating the recorded data to represent daily per-user averages. I will cover this methodology in a future blog post, as it will take a fair amount of explanation. Another option is to use message tracking logs to generate these statistics. This could be done via some crafty custom PowerShell scripting, or you could look for scripts that attempt to do this work for you already. One of our own consultants points to an example on his blog.
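
As a rough illustration of the performance counter approach (the full methodology will come in that future post), here is a sketch that samples the Information Store submission and delivery counters and extrapolates a per-user daily figure. The mailbox count, workday length, and sampling window are placeholders you would adjust for your environment, and it assumes the sampling happens during a representative peak average period on an Exchange 2010 Mailbox server.

# Sample store submission/delivery rates and extrapolate a rough per-user messages/day profile.
# Placeholders: adjust mailbox count, workday hours, and sampling window for your environment.
$mailboxesOnServer = 4000
$workdayHours      = 8
$counters   = "\MSExchangeIS\Messages Submitted/sec", "\MSExchangeIS\Messages Delivered/sec"
$sampleSets = Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 40   # ~10 minutes of samples

$all       = $sampleSets | ForEach-Object { $_.CounterSamples }
$submitted = ($all | Where-Object { $_.Path -like "*submitted/sec" } | Measure-Object -Property CookedValue -Average).Average
$delivered = ($all | Where-Object { $_.Path -like "*delivered/sec" } | Measure-Object -Property CookedValue -Average).Average

$messagesPerDay = ($submitted + $delivered) * 3600 * $workdayHours
"Approximate profile: {0:N0} messages per-user/per-day" -f ($messagesPerDay / $mailboxesOnServer)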

Typical user profiles range from 50-500 messages per-user/per-day, and we provide guidance for those profiles. When in doubt, round up.

The other important piece of profile information for sizing is the average message size seen in the deployment. This can be obtained from EPA, or from the other mentioned methods (via transport performance counters, or via message tracking logs). Within Microsoft, we typically see average message sizes of around 75KB, but we certainly have worked with customers that have much higher average message sizes. This can vary greatly by industry, and by region.
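
If you use message tracking logs, a rough average message size can be pulled with something like the sketch below. It looks at a single day of DELIVER events on one transport server, which is an assumption; a real assessment would aggregate across all transport servers and a longer window.

# Rough average message size (KB) from one day of message tracking logs on a single server.
$logs  = Get-MessageTrackingLog -EventId "DELIVER" -Start (Get-Date).AddDays(-1) -End (Get-Date) -ResultSize Unlimited
$avgKB = ($logs | Measure-Object -Property TotalBytes -Average).Average / 1KB
"Average message size: {0:N0} KB across {1} delivery events" -f $avgKB, ($logs | Measure-Object).Count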

Start with the Mailbox servers

Just as we recommended for Exchange 2010, the right way to start with sizing calculations for Exchange 2013 is with the Mailbox role. In fact, those of you who have sized deployments for Exchange 2010 will find many similarities with the methodology discussed here.

Example scenario

Throughout this article, we will be referring to an example deployment. The deployment is for a relatively large organization with the following attributes:

  • 100,000 mailboxes
  • 200 message/day profile, with 75KB average message size
  • 10GB mailbox quota
  • Single site
  • 4 mailbox database copies, no lagged copies
  • 2U commodity server hardware platform with internal drive bays and an external storage chassis will be used (total of 24 available large form-factor drive bays)
  • 7200 RPM 4TB midline SAS disks are used
  • Mailbox databases are stored on JBOD direct attached storage, utilizing no RAID
  • Solution must survive double failure events

High availability model

The first thing you need to determine is your high availability model, e.g., how you will meet the availability requirements that you determined earlier. This likely includes multiple database copies in one or more Database Availability Groups, which will have an impact on storage capacity and IOPS requirements. The TechNet documentation on this topic provides some background on the capabilities of Exchange 2013 and should be reviewed as part of the sizing process.

At a minimum, you need to be able to answer the following questions:

  • Will you deploy multiple database copies?
  • How many database copies will you deploy?
  • Will you have an architecture that provides site resilience?
  • What kind of resiliency model will you deploy?
  • How will you distribute database copies?
  • What storage architecture will you use?

Capacity requirements

Once you have an understanding of how you will meet your high availability requirements, you should know the number of database copies and sites that will be deployed. Given this, you can begin to evaluate capacity requirements. At a basic level, you can think of capacity requirements as consisting of storage for mailbox data (primarily based on mailbox storage quotas), storage for database log files, storage for content indexing files, and overhead for growth. Every copy of a mailbox database is a multiplier on top of these basic storage requirements. As a simplistic example, if I was planning for 500 mailboxes of 1GB each, the storage for mailbox data would be 500GB, and then I would need to apply various factors to that value to determine the per-copy storage requirement. From there, if I needed 3 copies of the data for high availability, I would then need to multiply by 3 to obtain the overall capacity requirement for the solution (all servers). In reality, the storage requirements for Exchange are far more complex, as you will see below.

Mailbox size

To determine the actual size of a mailbox on disk, we must consider 3 factors: the mailbox storage quota, database white space, and recoverable items.

The mailbox storage quota is what most people think of as the “size of the mailbox” – it’s the user-perceived size of their mailbox and represents the maximum amount of data that the user can store in their mailbox on the server. While this certainly represents the majority of space utilization for Exchange databases, it’s not the only element for which we have to size.

Database whitespace is the amount of space in the mailbox database file that has been allocated on disk but doesn’t contain any in-use database pages. Think of it as available space to grow into. As content is deleted out of mailbox databases and eventually removed from the mailbox recoverable items, the database pages that contained that content become whitespace. We recommend planning for whitespace size equal to 1 day worth of messaging content.

Estimated Database Whitespace per Mailbox = per-user daily message profile x average message size

This means that a user with the 200 message/day profile and an average message size of 75KB would be expected to consume the following whitespace:

200 messages/day x 75KB = 14.65MB

When items are deleted from a mailbox, they are really “soft-deleted” and moved temporarily to the recoverable items folder for the duration of the deleted item retention period. Like Exchange 2010, Exchange 2013 has a feature known as single item recovery which will prevent purging data from the recoverable items folder prior to reaching the deleted item retention window. When this is enabled, we expect to see a 1.2 percent increase in mailbox size for a 14 day deleted item retention window. Additionally, we expect to see a 3 percent increase in the size of the mailbox for calendar item version logging which is enabled by default. Given that a mailbox will eventually reach a steady state where the amount of new content will be approximately equal to the amount of deleted content in order to remain under quota, we would expect the size of the items in the recoverable items folder to eventually equal the size of new content sent & received during the retention window. This means that the overall size of the recoverable items folder can be calculated as follows:

Recoverable Items Folder Size = (per-user daily message profile x average message size x deleted item retention window) + (mailbox quota size x 0.012) + (mailbox quota size x 0.03)

If we carry our example forward with the 200 message/day profile, a 75KB average message size, a deleted item retention window of 14 days, and a mailbox quota of 10GB, the expected recoverable items folder size would be:

(200 messages/day x 75KB x 14 days) + (10GB x 0.012) + (10GB x 0.03)
= 210,000KB + 125,829.12KB + 314,572.8KB = 635.16MB

Given the results from these calculations, we can sum up the mailbox capacity factors to get our estimated mailbox size on disk:

Mailbox Size on disk = 10GB mailbox quota + 14.65MB database whitespace + 635.16MB Recoverable Items Folder = 10.63GB
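
To make that arithmetic repeatable, here is a small sketch that mirrors the whitespace and recoverable items formulas above. The function and parameter names are my own; the 1.2% and 3% factors and the 14-day retention default simply restate the assumptions used in this example.

# Estimated mailbox size on disk = quota + one day of whitespace + recoverable items folder.
function Get-EstimatedMailboxSizeGB {
    param(
        [double]$QuotaGB          = 10,
        [int]   $MessagesPerDay   = 200,
        [double]$AvgMessageSizeKB = 75,
        [int]   $RetentionDays    = 14
    )
    $whitespaceGB  = ($MessagesPerDay * $AvgMessageSizeKB) / 1MB      # KB -> GB
    $recoverableGB = (($MessagesPerDay * $AvgMessageSizeKB * $RetentionDays) / 1MB) +
                     ($QuotaGB * 0.012) + ($QuotaGB * 0.03)
    [math]::Round($QuotaGB + $whitespaceGB + $recoverableGB, 2)
}

# The example scenario returns roughly 10.63 (GB).
Get-EstimatedMailboxSizeGB -QuotaGB 10 -MessagesPerDay 200 -AvgMessageSizeKB 75 -RetentionDays 14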

Content indexing

The space required for files related to the content indexing process can be estimated as 20% of the database size.

Per-Database Content Indexing Space = database size x 0.20

In addition, you must size for one additional content index (i.e., an additional 20% of one of the mailbox databases on the volume) in order to allow content indexing maintenance tasks (specifically the master merge process) to complete. The best way to express the master merge space requirement is to look at the average database file size across all databases on a volume and add one database worth of disk consumption to the calculation when determining the per-volume content indexing space requirement:

Per-Volume Content Indexing Space = (average database size x (databases on the volume + 1) x 0.20)

As a simple example, if we had 2 mailbox databases on a single volume and each database consumed 100GB of space, we would compute the per-volume content indexing space requirement like this:

100GB database size x (2 databases + 1) x 0.20 = 60GB

Log space

The amount of space required for ESE transaction log files can be computed using the same method as Exchange 2010. You can find details on the process in the Exchange 2010 TechNet guidance. To summarize the process, you must first determine the base guideline for number of transaction logs generated per-user, per-day, using the following table. As in Exchange 2010, log files are 1MB in size, making the math for log capacity quite straightforward.

Message profile (75KB average message size) | Number of transaction logs generated per day
50  | 10
100 | 20
150 | 30
200 | 40
250 | 50
300 | 60
350 | 70
400 | 80
450 | 90
500 | 100

Once you have the appropriate value from the table which represents guidance for a 75KB average message size, you may need to adjust the value based on differences in the target average message size. Every time you double the average message size, you must increase the logs generated per day by an additional factor of 1.9. For example:

Transaction logs at 200 messages/day with 150KB average message size = 40 logs/day (at 75KB average message size) x 1.9 = 76

Transaction logs at 200 messages/day with 300KB average message size = 40 logs/day (at 75KB average message size) x (1.9 x 2) = 152

While daily log volume is interesting, it doesn’t represent the entire requirement for log capacity. If traditional backups are being used, logs will remain on disk for the interval between full backups. When mailboxes are moved, that volume of change to the target database will result in a significant increase in the amount of logs generated during the day. In a solution where Exchange native data protection is in use (e.g., you aren’t using traditional backups), logs will not be truncated if a mailbox database copy is failed or if an entire server is unreachable unless an administrator intervenes. There are many factors to consider when sizing for required log capacity, and it is certainly worth spending some time in the Exchange 2010 TechNet guidance mentioned earlier to fully understand these factors before proceeding. Thinking about our example scenario, we could consider log space required per database if we estimate the number of users per database at 65. We will also assume that 1% of our users are moved per week in a single day, and that we will allocate enough space to support 3 days of logs in the case of failed copies or servers.

Log Capacity to Support 3 Days of Truncation Failure = (65 mailboxes/database x 40 logs/day x 1MB log size) x 3 days = 7.62GB

Log Capacity to Support 1% mailbox moves per week = 65 mailboxes/database x 0.01 x 10.63GB mailbox size = 6.91GB

Total Log Capacity Required per Database = 7.62GB + 6.91GB = 14.53GB
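
The same log capacity math as a sketch, using the example’s assumptions (65 mailboxes per database, 40 logs per day for the 200 message profile, 1MB logs, a 3-day truncation failure buffer, and 1% of mailboxes moved in a single day each week). All variable names are illustrative.

# Per-database transaction log capacity: truncation-failure buffer plus mailbox-move churn.
$mailboxesPerDb   = 65
$logsPerDay       = 40       # from the table above (200 messages/day at 75KB)
$logSizeMB        = 1
$truncFailureDays = 3
$movePercent      = 0.01
$mailboxSizeGB    = 10.63

$truncFailureGB = ($mailboxesPerDb * $logsPerDay * $logSizeMB * $truncFailureDays) / 1KB   # MB -> GB
$moveGB         = $mailboxesPerDb * $movePercent * $mailboxSizeGB
"Log capacity required per database: {0:N2} GB" -f ($truncFailureGB + $moveGB)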

Putting all of the capacity requirements together

The easiest way to think about sizing for storage capacity without having a calculator tool available is to make some assumptions up front about the servers and storage that will be used. Within the product group, we are big fans of 2U commodity server platforms with ~12 large form-factor drive bays in the chassis. This allows for a 2 drive RAID array for the operating system, Exchange install path, transport queue database, and other ancillary files, and ~10 remaining drives to use as mailbox database storage in a JBOD direct attached storage configuration with no RAID. Fill this server up with 4TB SATA or midline SAS drives, and you have a fantastic Exchange 2013 server. If you need even more storage, it’s quite easy to add an additional shelf of drives to the solution.

Using the large deployment example and thinking about how we might size this on the commodity server platform, we can consider a server scaling unit that has a total of 24 large form-factor drive bays containing 4TB midline SAS drives. We will use 2 of those drives for the OS & Exchange, and the remaining drive bays will be used for Exchange mailbox database capacity. Let’s use 12 of those drive bays for databases – that leaves 10 remaining drive bays that could contain spares or remain empty. For this sizing exercise, let’s also plan for 4 databases per drive. Each of those drives has a formatted capacity of ~3725GB. The first step in figuring out the number of mailboxes per database is to look at overall capacity requirements for the mailboxes, content indexes, and required free space (which we will set to 5%).

To calculate the maximum amount of space available for mailboxes, let’s apply a formula (note that this doesn’t consider space for logs – we will make sure that the volume will have enough space for logs later in the process). First, we can remove our required free space from the available storage on the drive:

Available Space (excluding required free space) = Formatted capacity of the drive x (1 – free space)

Then we can remove the space required for content indexing. As discussed above, the space required for content indexing will be 20% of the database size, with an additional 20% of one database for content indexing maintenance tasks. Given the additional 20% requirement, we can’t model the overall space requirement as a simple 20% of the remaining space on the volume. Instead we need to compute a new percentage that takes the number of databases per-volume into consideration.

Content Indexing Space Factor = 0.20 x (databases per volume + 1) / databases per volume

Now we can remove the space for content indexing from our available space on the volume:

Space Available for Database Files = Available Space (excluding required free space) / (1 + Content Indexing Space Factor)

And we can then divide by the number of databases per-volume to get our maximum database size:

Maximum Database Size = Space Available for Database Files / number of databases per volume

In our example scenario, we would obtain the following result:

Maximum Database Size = (3725GB x 0.95) / (1 + 0.25) / 4 databases = ~707.75GB

Given this value, we can then calculate our maximum users per database (from a capacity perspective, as this may change when we evaluate the IO requirements):

Maximum Users per Database = 707.75GB / 10.63GB per mailbox = 66.6, round down to 66 users

Let’s see if that number is actually reasonable given our 4 copy configuration. We are going to use 16-node DAGs for this deployment to take full advantage of the scalability and high-availability benefits of large DAGs. While we have many drives available on our selected hardware platform, we will be limited by the maximum of 50 database copies per-server in Exchange 2013. Considering this maximum and our desire to have 4 databases per volume, we can calculate the maximum number of drives for mailbox database usage as:

Maximum Database Volumes = 50 database copies per-server / 4 databases per-volume = 12.5, round down to 12 volumes

With 12 database volumes and 4 database copies per-volume, we will have 48 total database copies per server.

12 database volumes x 4 database copies per-volume = 48 database copies per-server

With 66 users per database and 100,000 total users, we end up with the following required DAG count for the user population:

Required DAGs = 100,000 users / (66 users per-database x (16 servers x 48 copies per-server / 4 copies per-database)) = 100,000 / 12,672 = ~7.9 DAGs

In this very large deployment, we are using a DAG as a unit of scale or “building block” (e.g. we perform capacity planning based on the number of DAGs required to meet demand, and we deploy an entire DAG when we need additional capacity), so we don’t intend to deploy a partial DAG. If we round up to 8 DAGs we can compute our final users per database count:

Users per Database = 100,000 users / (8 DAGs x 192 databases per DAG) = 65.1, round down to 65 users per-database

With 65 users per-database, that means we will expect to consume the following space for mailbox databases:

Estimated Database Size = 65 users x 10.63GB = 690.95GB
Database Consumption / Volume = 690.95GB x 4 databases = 2763.8GB

Using the formula mentioned earlier, we can compute our estimated content index consumption as well:

690.95GB database size x (4 databases + 1) x 0.20 = 690.95GB

You’ll recall that we computed transaction log space requirements earlier, and it turns out that we magically computed those values with the assumption that we would have 65 users per-database. What a pleasant coincidence! So we will need 14.53GB of space for transaction logs per-database, or to get a more useful result:

Log Space Required / Volume = 14.53GB x 4 databases = 58.12GB

To sum it up, we can estimate our total per-volume space utilization and make sure that we have plenty of room on our target 4TB drives:

2763.8GB database consumption + 690.95GB content indexing + 58.12GB transaction logs = 3512.87GB required per-volume, against 3538.75GB available (3725GB formatted capacity less 5% free space)

Looks like our database volumes are sized perfectly!
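
If you want to sanity-check a candidate layout the same way, a sketch like this rolls the per-volume numbers (databases, content index including the master merge headroom, and logs) up against the formatted capacity less the free space reservation. The inputs are the ones from this example.

# Per-volume capacity roll-up: 4 databases of 65 mailboxes each on a ~3725GB formatted drive.
$formattedGB  = 3725
$freeSpacePct = 0.05
$dbsPerVolume = 4
$dbSizeGB     = 65 * 10.63                               # 690.95GB per database
$ciGB         = $dbSizeGB * ($dbsPerVolume + 1) * 0.20   # content index plus master merge headroom
$logsGB       = 14.53 * $dbsPerVolume

$requiredGB  = ($dbSizeGB * $dbsPerVolume) + $ciGB + $logsGB
$availableGB = $formattedGB * (1 - $freeSpacePct)
"Required: {0:N2} GB  Available: {1:N2} GB  Fits: {2}" -f $requiredGB, $availableGB, ($requiredGB -le $availableGB)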

IOPS requirements

To determine the IOPS requirements for a database, we look at the number of users hosted on the database and consider the guidance provided in the following table to compute total required IOPS when the database is active or passive.

Messages sent or received per mailbox per day | Estimated IOPS per mailbox (Active or Passive)
50  | 0.034
100 | 0.067
150 | 0.101
200 | 0.134
250 | 0.168
300 | 0.201
350 | 0.235
400 | 0.268
450 | 0.302
500 | 0.335

For example, with 50 users in a database, with an average message profile of 200, we would expect that database to require 50 x 0.134 = 6.7 transactional IOPS when the database is active, and 50 x 0.134 = 6.7 transactional IOPS when the database is passive. Don’t forget to consider database placement which will impact the number of databases with IOPS requirements on a given storage volume (which could be a single JBOD drive or might be a more complex storage configuration).

Going back to our example scenario, we can evaluate the IOPS requirement of the solution, recalling that the average user profile in that deployment is the 200 message/day profile. We have 65 users per database and 4 databases per JBOD drive, so we can estimate our IOPS requirement in worst-case (all databases active) as:

65 mailboxes x 4 databases per-drive x 0.134 IOPS/mailbox at 200 messages/day profile = ~34.84 IOPS per drive

Midline SAS drives typically provide ~57.5 random IOPS (based on our own internal observations and benchmark tests), so we are well within design constraints when thinking about IOPS requirements.
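
The same worst-case IOPS check as a snippet, using the table value for the 200 message/day profile and the ~57.5 IOPS figure quoted above. Treat the per-drive IOPS capability as an assumption to validate with Jetstress on your actual hardware.

# Worst-case transactional IOPS per JBOD drive vs. an assumed midline SAS spindle capability.
$mailboxesPerDb = 65
$dbsPerDrive    = 4
$iopsPerMailbox = 0.134     # 200 messages/day profile, from the table above
$driveIops      = 57.5      # assumed random IOPS for a 7,200 RPM midline SAS drive

$requiredIops = $mailboxesPerDb * $dbsPerDrive * $iopsPerMailbox
"Required: {0:N2} IOPS  Drive capability: {1} IOPS  Headroom: {2:P0}" -f $requiredIops, $driveIops, (1 - ($requiredIops / $driveIops))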

Storage bandwidth requirements

While IOPS requirements are usually the primary storage throughput concern when designing an Exchange solution, it is possible to run up against bandwidth limitations with various types of storage subsystems. The IOPS sizing guidance above is looking specifically at transactional (somewhat random) IOPS and is ignoring the sequential IO portion of the workload. One place that sequential IO becomes a concern is with storage solutions that are running a large amount of sequential IO through a common channel. A common example of this type of load is the ongoing background database maintenance (BDM) which runs continuously on Exchange mailbox databases. While this BDM workload might not be significant for a few databases stored on a JBOD drive, it may become a concern if all of the mailbox database volumes are presented through a common iSCSI or Fibre Channel interface. In that case, the bandwidth of that common channel must be considered to ensure that the solution doesn’t bottleneck due to these IO patterns.

In Exchange 2013, we expect to consume approximately 1MB/sec/database copy for BDM which is a significant reduction from Exchange 2010. This helps to enable the ability to store multiple mailbox databases on the same JBOD drive spindle, and will also help to avoid bottlenecks on networked storage deployments such as iSCSI. This bandwidth utilization is in addition to bandwidth consumed by the transactional IO activity associated with user and system workload processes, as well as storage bandwidth consumed by the log replication and replay process in a DAG.

Transport storage requirements

Since transport components (with the exception of the front-end transport component on the CAS role) are now part of the Mailbox role, we have included CPU and memory requirements for transport with the general Mailbox role requirements described later. Transport also has storage requirements associated with the queue database. These requirements, much like I described earlier for mailbox storage, consist of capacity factors and IO throughput factors.

Transport storage capacity is driven by two needs: queuing (including shadow queuing) and Safety Net (which is the replacement for transport dumpster in this release). You can think of the transport storage capacity requirement as the sum of message content on disk in a worst-case scenario, consisting of three elements:

  • The current day’s message traffic, along with messages which exist on disk longer than normal expiration settings (like poison queue messages)
  • Queued messages waiting for delivery
  • Messages persisted in Safety Net in case they are required for redelivery

Of course, all three of these factors are also impacted by shadow queuing in which a redundant copy of all messages is stored on another server. At this point, it would be a good idea to review the TechNet documentation on Transport High Availability if you aren’t familiar with the mechanics of shadow queuing and Safety Net.

In order to figure out the messages per day that you expect to run through the system, you can look at the user count and messaging profile. Simply multiplying these together will give you a total daily mail volume, but it will be a bit higher than necessary since it is double counting messages that are sent within the organization (i.e. a message sent to a coworker will count towards the profile of the sending user as well as the profile of the receiving user, but it’s really just one message traversing the system). The simplest way to deal with that would be to ignore this fact and oversize transport, which will provide additional capacity for unexpected peaks in message traffic. An alternative way to determine daily message flow would be to evaluate performance counters within your existing messaging system.

To determine the maximum size of the transport database, we can look at the entire system as a unit and then come up with a per-server value.

Overall Daily Messages Traffic = number of users x message profile

Overall Transport DB Size = average message size x overall daily message traffic x (1 + (percentage of messages queued x maximum queue days) + Safety Net hold days) x 2 copies for high availability

Let’s use the 100,000 user sizing example again and size the transport database using the simple method.

Overall Transport DB Size = 75KB x (100,000 users x 200 messages/day) x (1 + (50% x 2 maximum queue days) + 2 Safety Net hold days) x 2 copies = 11,444GB

In our example scenario, we have 8 DAGs, each containing 16-nodes, and we are designing to handle double node failures in each DAG. This means that in a worst-case failure event we would have 112 servers online with 2 failed servers in each DAG. We can use this value to determine a per-server transport DB size:

Transport DB Size per-server = 11,444GB / 112 servers = ~102GB
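
Here is the transport database estimate as a snippet. The 50% queued assumption, the 2-day maximum queue and Safety Net windows, and the 112 surviving servers all come from the example scenario above; none of this replaces the calculator.

# Overall and per-server transport queue database size estimate.
$users            = 100000
$messagesPerDay   = 200
$avgMessageKB     = 75
$pctQueued        = 0.50
$maxQueueDays     = 2
$safetyNetDays    = 2
$survivingServers = 112      # 8 DAGs x 16 nodes, less 2 failed nodes per DAG

$dailyTrafficKB = $users * $messagesPerDay * $avgMessageKB
$overallGB      = $dailyTrafficKB * (1 + ($pctQueued * $maxQueueDays) + $safetyNetDays) * 2 / 1MB   # x2 for HA copies; KB -> GB
"Overall: {0:N0} GB  Per-server: {1:N0} GB" -f $overallGB, ($overallGB / $survivingServers)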

Sizing for transport IO throughput requirements is actually quite simple. Transport has taken advantage of many of the IO reduction changes to the ESE database that have been made in recent Exchange releases. As a result, the number of IOPS required to support transport is significantly lower. In the internal deployment we used to produce this sizing guidance, we see approximately 1 DB write IO per message and virtually no DB read IO, with an average message size of ~75KB. We expect that as average message size increases, the amount of transport IO required to support delivery and queuing would increase. We do not currently have specific guidance on what that curve looks like, but it is an area of active investigation. In the meantime, our best practices guidance for the transport database is to leave it in the Exchange install path (likely on the OS drive) and ensure that the drive supporting that directory path is using a protected write cache disk controller, set to 100% write cache if the controller allows optimization of read/write cache settings. The write cache allows transport database log IO to become effectively “free” and allows transport to handle a much higher level of throughput.

Processor requirements

Once we have our storage requirements figured out, we can move on to thinking about CPU. CPU sizing for the Mailbox role is done in terms of megacycles. A megacycle is a unit of processing work equal to one million CPU cycles. In very simplistic terms, you could think of a 1 MHz CPU performing a megacycle of work every second. Given the guidance provided below for megacycles required for active and passive users at peak, you can estimate the required processor configuration to meet the demands of an Exchange workload. Following are our recommendations on the estimated required megacycles for the various user profiles.

Messages sent or received per mailbox per day | Mcycles per User, Active DB Copy or Standalone (MBX only) | Mcycles per User, Active DB Copy or Standalone (Multi-Role) | Mcycles per User, Passive DB Copy
50  | 2.13  | 2.66  | 0.69
100 | 4.25  | 5.31  | 1.37
150 | 6.38  | 7.97  | 2.06
200 | 8.50  | 10.63 | 2.74
250 | 10.63 | 13.28 | 3.43
300 | 12.75 | 15.94 | 4.11
350 | 14.88 | 18.59 | 4.80
400 | 17.00 | 21.25 | 5.48
450 | 19.13 | 23.91 | 6.17
500 | 21.25 | 26.56 | 6.85

The second column represents the estimated megacycles required on the Mailbox role server hosting the active copy of a user’s mailbox database. In a DAG configuration, the required megacycles for the user on each server hosting passive copies of that database can be found in the fourth column. If the solution is going to include multi-role (Mailbox+CAS) servers, use the value in the third column rather than the second, as it includes the additional CPU requirements for the CAS role.

It is important to note that while many years ago you could make an assumption that a 500 MHz processor could perform roughly double the work per unit of time as a 250 MHz processor, clock speeds are no longer a reliable indicator of performance. The internal architecture of modern processors is different enough between manufacturers as well as within product lines of a single manufacturer that it requires an additional normalization step to determine the available processing power for a particular CPU. We recommend using the SPECint_rate2006 benchmark from the Standard Performance Evaluation Corporation.

The baseline system used to generate this guidance was a Hewlett-Packard DL380p Gen8 server containing Intel Xeon E5-2650 2 GHz processors. The baseline system SPECint_rate2006 score is 540, or 33.75 per-core, given that the benchmarked server was configured with a total of 16 physical processor cores. Please note that this is a different baseline system than what was used to generate our Exchange 2010 guidance, so any tools or calculators that make assumptions based on the 2010 baseline system would not provide accurate results for sizing an Exchange 2013 solution.

Using the same general methodology we have recommended in prior releases, you can determine the estimated available Exchange workload megacycles available on a different processor through the following process:

  1. Find the SPECint_rate2006 score for the processor that you intend to use for your Exchange solution. You can do this the hard way (described below) or use Scott Alexander’s fantastic Processor Query Tool to get the per-server score and processor core count for your hardware platform.
    1. On the website of the Standard Performance Evaluation Corporation, select Results, highlight CPU2006, and select Search all SPECint_rate2006 results.
    2. Under Simple Request, enter the search criteria for your target processor, for example Processor Matches E5-2630.
    3. Find the server and processor configuration you are interested in using (or if the exact combination is not available, find something as close as possible) and note the value in the Result column and the value in the # Cores column.
  2. Obtain the per-core SPECint_rate2006 score by dividing the value in the Result column by the value in the # Cores column. For example, in the case of the Hewlett-Packard DL380p Gen8 server with Intel Xeon E5-2630 processors (2.30GHz), the Result is 430 and the # Cores is 12, so the per-core value would be 430 / 12 = 35.83.
  3. To determine the estimated available Exchange workload megacycles on the target platform, use the following formula:

    Available Megacycles per-core = (per-core SPECint_rate2006 score of target platform / 33.75) x 2,000

    Using the example HP platform with E5-2630 processors mentioned previously, we would calculate the following result:

    (35.83 / 33.75) x 2,000 = 2,123 available megacycles per-core
    2,123 megacycles per-core x 12 processor cores = 25,479 available megacycles per-server
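
The normalization step is easy to wrap in a helper function. The 33.75 baseline per-core score and the 2,000 megacycles per baseline core reflect the baseline platform described above; the function name is my own, and small rounding differences from the worked example are expected.

# Convert a platform's SPECint_rate2006 result into estimated available Exchange workload megacycles.
function Get-AvailableMegacycles {
    param(
        [double]$SpecIntRate2006,   # full-server SPECint_rate2006 result
        [int]   $Cores              # physical cores in the benchmarked configuration
    )
    $baselinePerCoreScore  = 33.75  # HP DL380p Gen8, E5-2650 2GHz: 540 / 16 cores
    $baselinePerCoreCycles = 2000   # 2GHz baseline ~ 2,000 megacycles per core
    [math]::Round((($SpecIntRate2006 / $Cores) / $baselinePerCoreScore) * $baselinePerCoreCycles * $Cores, 0)
}

# Example: the E5-2630 platform scoring 430 across 12 cores yields roughly 25,480 megacycles per server.
Get-AvailableMegacycles -SpecIntRate2006 430 -Cores 12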

Keep in mind that a good Exchange design should never plan to run servers at 100% of CPU capacity. In general, 80% CPU utilization in a failure scenario is a reasonable target for most customers. Given that caveat that the high CPU utilization occurs during a failure scenario, this means that servers in a highly available Exchange solution will often run with relatively low CPU utilization during normal operation. Additionally, there may be very good reasons to target a lower CPU utilization as maximum, particularly in cases where unanticipated spikes in load may result in acute capacity issues.

Going back to the example I used previously of 100,000 users with the 200 message/day profile, we can estimate the total required megacycles for the deployment. We know that there will be 4 database copies in the deployment, and that will help to calculate the passive megacycles required. We also know that this deployment will be using multi-role (Mailbox+CAS) servers. Given this information, we can calculate megacycle requirements as follows:

100,000 users x ((10.63 mcycles per active mailbox) + (3 passive copies x 2.74 mcycles per passive mailbox)) = 1,885,000 total mcycles required

You could then take that number and attempt to come up with a required server count. I would argue that it’s actually a much better practice to come up with a server count based on high availability requirements (taking into account how many component failures your design can handle in order to meet business requirements) and then ensure that those servers can meet CPU requirements in a worst-case failure scenario. You will either meet CPU requirements without any additional changes (if your server count is bound on another aspect of the sizing process), or you will adjust the server count (scale out), or you will adjust the server specification (scale up).

Continuing with our hypothetical example, if we knew that the high availability requirements for the design of the 100,000 user example resulted in a maximum of 16 databases being active at any time out of 48 total database copies per server, and we know that there are 65 users per database, we can determine the per-server CPU requirements for the deployment.

(16 databases x 65 mailboxes x 10.63 mcycles per active mailbox) + (32 databases x 65 mailboxes x 2.74 mcycles per passive mailbox) = 11055.2 + 5699.2 = 16,754.4 mcycles per server

Using the processor configuration mentioned in the megacycle normalization section (E5-2630 2.3 GHz processors on an HP DL380p Gen8), we know that we have 25,479 available mcycles on the server, so we would estimate a peak average CPU in worst-case failure of:

16,754.4 mcycles required / 25,479 available mcycles = ~66% peak average CPU utilization

That is below our guidance of 80% maximum CPU utilization (in a worst-case failure scenario), so we would not consider the servers to be CPU bound in the design. In fact, we could consider adjusting the CPU selection to a cheaper option with reduced performance, getting us closer to a peak average CPU of 80% in a worst-case failure and reducing the cost of the overall solution.
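
The worst-case CPU check itself is simple enough to script; the inputs below are the ones from this example (16 active and 32 passive copies per server, 65 users per database, multi-role megacycle values from the table, and 25,479 available megacycles).

# Peak average CPU in a worst-case failure: required megacycles vs. available megacycles per server.
$activeDbs             = 16
$passiveDbs            = 32
$usersPerDb            = 65
$activeMcyclesPerUser  = 10.63   # 200 profile, multi-role column
$passiveMcyclesPerUser = 2.74
$availableMcycles      = 25479

$requiredMcycles = ($activeDbs * $usersPerDb * $activeMcyclesPerUser) +
                   ($passiveDbs * $usersPerDb * $passiveMcyclesPerUser)
"Peak CPU in worst-case failure: {0:P1}" -f ($requiredMcycles / $availableMcycles)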

Memory requirements

To calculate memory per server, you will need to know the per-server user count (both active and passive users) as well as determine whether you will run the Mailbox role in isolation or deploy multi-role servers (Mailbox+CAS). Keep in mind that regardless of whether you deploy roles in isolation or deploy multi-role servers, the minimum amount of RAM on any Exchange 2013 server is 8GB.

Memory on the Mailbox role is used for many purposes. As in prior releases, a significant amount of memory is used for ESE database cache and plays a large part in the reduction of disk IO in Exchange 2013. The new content indexing technology in Exchange 2013 also uses a large amount of memory. The remaining large consumers of memory are the various Exchange services that provide either transactional services to end-users or handle background processing of data. While each of these individual services may not use a significant amount of memory, the combined footprint of all Exchange services can be quite large.

Following is our recommended amount of memory for the Mailbox role on a per mailbox basis that we expect to be used at peak.

Messages sent or received per mailbox per day | Mailbox role memory per active mailbox (MB)
50  | 12
100 | 24
150 | 36
200 | 48
250 | 60
300 | 72
350 | 84
400 | 96
450 | 108
500 | 120

To determine the amount of memory that should be provisioned on a server, take the number of active mailboxes per-server in a worst-case failure and multiply by the value associated with the expected user profile. From there, round up to a value that makes sense from a purchasing perspective (i.e. it may be cheaper to configure 128GB of RAM compared to a smaller amount of RAM depending on slot options and memory module costs).

Mailbox Memory per-server = (worst-case active database copies per-server x users per-database x memory per-active mailbox)

For example, on a server with 48 database copies (16 active in worst-case failure), 65 users per-database, expecting the 200 profile, we would recommend:

16 x 65 x 48MB = 48.75GB, round up to 64GB
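
As a snippet, with the same inputs as this example (16 worst-case active databases, 65 users per database, 48MB per active mailbox for the 200 profile); the rounding target depends entirely on your server’s DIMM options.

# Mailbox role memory: worst-case active copies x users per database x per-mailbox memory for the profile.
$worstCaseActiveDbs = 16
$usersPerDb         = 65
$memPerActiveMB     = 48     # 200 messages/day profile, from the table above

$mailboxMemoryGB = ($worstCaseActiveDbs * $usersPerDb * $memPerActiveMB) / 1KB   # MB -> GB
"Calculated: {0:N2} GB - round up to a practical memory configuration (64GB in this example)" -f $mailboxMemoryGB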

It’s important to note that the content indexing technology included with Exchange 2013 uses a relatively large amount of memory to allow both indexing and query processing to occur very quickly. This memory usage scales with the number of items indexed, meaning that as the number of total items stored on a Mailbox role server increases (for both active and passive copies), memory requirements for the content indexing processes will increase as well. In general, the guidance on memory sizing presented here assumes approximately 15% of the memory on the system will be available for the content indexing processes which means that with a 75KB average message size, we can accommodate mailbox sizes of 3GB at 50 message profile up to 32GB at the 500 message profile without adjusting the memory sizing. If your deployment will have an extremely small average message size or an extremely large average mailbox size, you may need to add additional memory to accommodate the content indexing processes.

Multi-role server deployments will have an additional memory requirement beyond the amounts specified above. CAS memory is computed as a base memory requirement for the CAS components (2GB) plus additional memory that scales based on the expected workload. This overall CAS memory requirement on a multi-role server can be computed using the following formula:

Per-Server Multi-Role CAS Memory = 2GB + (2GB x number of processor cores (including fractional cores) serving active load at peak in a worst-case failure scenario)

Essentially this is 2GB of memory for the base requirement, plus 2GB of memory for each processor core (or fractional processor core) serving active load at peak in a worst-case failure scenario. Reusing the example scenario, if I have 16 active databases per-server in a worst-case failure and my processor is providing 2123 mcycles per-core, I would need:

2GB + (2GB x 1.04 processor cores serving active CAS load at peak, based on the active CAS workload megacycles divided by 2,123 mcycles per-core) = 4.08GB

If we add that to the memory requirement for the Mailbox role calculated above, our total memory requirement for the multi-role server would be:

48.75GB for Mailbox + 4.08GB for CAS = 52.83GB, round up to 64GB

Regardless of whether you are considering a multi-role or a split-role deployment, it is important to ensure that each server has a minimum amount of memory for efficient use of the database cache. There are some scenarios that will produce a relatively small memory requirement from the memory calculations described above. We recommend comparing the per-server memory requirement you have calculated with the following table to ensure you meet the minimum database cache requirements. The guidance is based on total database copies per-server (both active and passive). If the value shown in this table is higher than your calculated per-server memory requirement, adjust your per-server memory requirement to meet the minimum listed in the table.

Per-Server DB Copies | Minimum Physical Memory (GB)
1-10  | 8
11-20 | 10
21-30 | 12
31-40 | 14
41-50 | 16

In our example scenario, we are deploying 48 database copies per-server, so the minimum physical memory to provide necessary database cache would be 16GB. Since our computed memory requirement based on per-user guidance including memory for the CAS role (52.83GB) was higher than the minimum of 16GB, we don’t need to make any further adjustments to accommodate database cache needs.

Unified messaging

With the new architecture of Exchange, Unified Messaging is now installed and ready to be used on every Mailbox and CAS. The CPU and memory guidance provided here assumes some moderate UM utilization. In a deployment with significant UM utilization with very high call concurrency, additional sizing may need to be performed to provide the best possible user experience. As in Exchange 2010, we recommend using a 100 concurrent call per-server limit as the maximum possible UM concurrency, and scale out the deployment if the sizing of your deployment becomes bound on this limit. Additionally, voicemail transcription is a very CPU-intensive operation, and by design will only transcribe messages when there is enough available CPU on the machine. Each voicemail message requires 1 CPU core for the duration of the transcription operation, and if that amount of CPU cannot be obtained, transcription will be skipped. In deployments that anticipate a high amount of voicemail transcription concurrency, server configurations may need to be adjusted to increase CPU resources, or the number of users per server may need to be scaled back to allow for more available CPU for voicemail transcription operations.

Sizing and scaling the Client Access Server role

In the case where you are going to place the Mailbox and CAS roles on separate servers, the process of sizing CAS is relatively straightforward. CAS sizing is primarily focused on CPU and memory requirements. There is some disk IO for logging purposes, but it is not significant enough to warrant specific sizing guidance.

CAS CPU is sized as a ratio from Mailbox role CPU. Specifically, we need to get 25% of the megacycles used to support active users on the Mailbox role. You could think of this as a 1:4 ratio (CAS CPU to Mailbox CPU) compared to the 3:4 ratio we recommended in Exchange 2010. One way to compute this would be to look at the total active user megacycles required for the solution, take 25% of that, and then determine the required CAS server count based on high availability requirements and multi-site design constraints. For example, consider the 100,000 user example using the 200 message/day profile:

Total CAS Required Mcycles = 100,000 users x 8.5 mcycles x 0.25 = 212,500 mcycles

Assuming that we want to target a maximum CPU utilization of 80% and the servers we plan to deploy have 25,479 available megacycles, we can compute the required number of servers quite easily:

Required CAS Servers = 212,500 mcycles / (25,479 available mcycles per-server x 0.80 maximum utilization) = 10.4, round up to 11 servers

Obviously we would need to then consider whether the 11 required servers meet our high availability requirements considering the maximum CAS server failures that we must design for given business requirements, as well as the site configuration where some of the CAS servers may be in different sites handling different portions of the workload. Since we specified in our example scenario that we want to survive a double failure in the single site, we would increase our 11 CAS servers to 13 such that we could sustain 2 CAS server failures and still handle the workload.
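
A sketch of that CAS count math, using the 80% utilization ceiling and the double-failure allowance from the example scenario; site placement and other availability constraints still have to be layered on top.

# Required CAS servers: 25% of active Mailbox megacycles, at 80% maximum CPU, plus failure headroom.
$users                = 100000
$activeMcyclesPerUser = 8.5      # 200 profile, Mailbox-only column
$availableMcycles     = 25479
$maxUtilization       = 0.80
$toleratedCasFailures = 2

$casMcycles  = $users * $activeMcyclesPerUser * 0.25
$baseServers = [math]::Ceiling($casMcycles / ($availableMcycles * $maxUtilization))
"Base CAS servers: $baseServers  With failure headroom: $($baseServers + $toleratedCasFailures)"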

To size memory, we will use the same formula that was used for Exchange 2010:

Per-Server CAS Memory = 2GB + 2GB per physical processor core

Per-Server CAS Memory = 2GB + (2GB x number of processor cores utilized by the CAS workload at peak)

Using the example scenario we have been using, we can calculate the per-server CAS memory requirement as:

2GB + (2GB x (212,500 total CAS mcycles / 11 servers / 2,123 mcycles per-core)) = 2GB + (2GB x 9.1 cores) = 20.20GB

In this example, 20.20GB would be the guidance for required CAS memory, but obviously you would need to round-up to the next highest possible (or highest performing) memory configuration for the server platform: perhaps 24GB.

Active Directory capacity for Exchange 2013

Active Directory sizing remains the same as it was for Exchange 2010. As we gain more experience with production deployments we may adjust this in the future. For Exchange 2013, we recommend deploying a ratio of 1 Active Directory global catalog processor core for every 8 Mailbox role processor cores handling active load, assuming 64-bit global catalog servers:

Required GC Cores = Mailbox role processor cores handling active load / 8

If we revisit our example scenario, we can easily calculate the number of GC cores required.

Required GC Cores = ((100,000 users x 8.5 active mcycles per user) / 2,123 mcycles per-core) / 8 = ~50 GC cores

Assuming that my Active Directory GCs are also deployed on the same server hardware configuration as my CAS & Mailbox role servers in the example scenario with 12 processor cores, then my GC server count would be:

50 GC cores / 12 cores per server = 4.2, round up to 5 global catalog servers

In order to sustain double failures, we would need to add 2 more GCs to this calculation, which would take us to 7 GC servers for the deployment.
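
The GC math for this example can be sketched the same way; the 1:8 core ratio, the 2,123 megacycles per core, the 12 cores per GC server, and the two extra servers for double-failure tolerance are all values used above.

# Global catalog sizing: 1 GC core per 8 Mailbox role cores handling active load.
$users                = 100000
$activeMcyclesPerUser = 8.5      # 200 profile, Mailbox-only column
$mcyclesPerCore       = 2123
$coresPerGcServer     = 12
$toleratedGcFailures  = 2

$activeMailboxCores = ($users * $activeMcyclesPerUser) / $mcyclesPerCore
$gcCores            = $activeMailboxCores / 8
$gcServers          = [math]::Ceiling($gcCores / $coresPerGcServer) + $toleratedGcFailures
"GC cores required: {0:N1}  GC servers (with failure headroom): {1}" -f $gcCores, $gcServers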

As a best practice, we recommend sizing memory on the global catalog servers such that the entire NTDS.DIT database file can be contained in RAM. This will provide optimal query performance and a much better end-user experience for Exchange workloads.
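
A quick way to check this on an existing GC is to compare the size of the NTDS.DIT file with the installed RAM. The path below is the default database location and is an assumption, since the DIT can be relocated.

# Check the Active Directory database size on a global catalog server.
$ditPath = "C:\Windows\NTDS\ntds.dit"    # default location; check the DC's configuration if the DIT has been moved
if (Test-Path $ditPath) {
    "NTDS.DIT is {0:N2} GB - size GC memory so the entire file fits in RAM" -f ((Get-Item $ditPath).Length / 1GB)
}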

Hyperthreading: Wow, free processors!

Turn it off. While modern implementations of simultaneous multithreading (SMT), also known as hyperthreading, can absolutely improve CPU throughput for most applications, the benefits to Exchange 2013 do not outweigh the negative impacts. It turns out that there can be a significant impact to memory utilization on Exchange servers when hyperthreading is enabled due to the way the .NET server garbage collector allocates heaps. The server garbage collector looks at the total number of logical processors when an application starts up and allocates a heap per logical processor. This means that the memory usage at startup for one of our services using the server garbage collector will be close to double with hyperthreading turned on vs. when it is turned off. This significant increase in memory, along with an analysis of the actual CPU throughput increase for Exchange 2013 workloads in internal lab tests has led us to a best practice recommendation that hyperthreading should be disabled for all Exchange 2013 servers. The benefits don’t outweigh the negative impact.
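
A simple way to spot servers where SMT is still enabled is to compare physical cores with logical processors, for example with the WMI check below (the change itself is made in the server’s BIOS/UEFI, not from Windows).

# Flag servers where logical processors exceed physical cores (hyperthreading/SMT enabled).
$cpus              = Get-WmiObject -Class Win32_Processor
$physicalCores     = ($cpus | Measure-Object -Property NumberOfCores -Sum).Sum
$logicalProcessors = ($cpus | Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum
if ($logicalProcessors -gt $physicalCores) {
    "Hyperthreading appears to be enabled: $physicalCores physical cores, $logicalProcessors logical processors. Disable it in BIOS/UEFI."
}
else {
    "Hyperthreading appears to be disabled: $physicalCores physical cores, $logicalProcessors logical processors."
}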

You are going to give me a calculator, right?

Now that you have digested all of this guidance, you are probably thinking about how much more of a pain it will be to size a deployment compared to using the Mailbox Role Requirements Calculator for Exchange 2010. You would be right, and we fully understand that. In fact, we are hard at work on a new calculator for Exchange 2013 and we plan to deliver it later this quarter. Stay tuned to the Exchange team blog for an announcement.

Hopefully that leaves you with enough information to begin to properly size your Exchange 2013 deployments. If you have further questions, you can obviously post comments here, but I’d also encourage you to consider attending one of the upcoming TechEd events. I’ll be at TechEd North America as well as TechEd Europe with a session specifically on this topic, and would be happy to answer your questions in person, either in the session or at the “Ask the Experts” event. Recordings of those sessions will also be posted to MSDN Channel9 after the events have concluded.

Jeff Mealiffe
Senior Program Manager Lead
Exchange Customer Experience
