MS Exchange Team Blog

aka the Microsoft Exchange Team Blog

It Takes a Long Time…

February 20, 2012 - 10:00am

Following our recent announcement of the release of Update Rollup 1 for Exchange 2010 Service Pack 2, you will see we released a ton of fixes. I wanted to blog about one specifically, and maybe at the same time provide some background into how issues like these come about and how we go about fixing them.

The specific fix is one cunningly referred to as 2556113, with the title, It takes a long time for a user to download an OAB in an Exchange Server 2010 organization.

With a title like that you might be thinking that we simply figured out a way to make OAB downloads ‘faster’. You might start thinking that we did that by just randomly deleting some of the users in the OAB: those you don’t know, or the people working in accounting on the fourth floor, for example. Or perhaps we had tried to reduce the details we included in the OAB, perhaps by just removing unnecessary information like family names, office location or phone numbers. Or maybe we simply increased the speed of the Internet. Because that’s really easy.

Well, we didn’t do any of those (though we are looking into that whole Internet thing to see what we can do about it, as it sounds awesome). Instead, we added some logic to ensure that Outlook tries to download the OAB from a CAS closest to itself.

“Why?” you ask. Well, it’s a good question and I reply with “As the KB article says, ‘Consider the following scenario…’”

  • You have two Active Directory sites on a slow network in a Microsoft Exchange Server 2010 organization.
  • You have an Exchange Server 2010 Client Access server and an Exchange Server 2010 Mailbox server in one Active Directory site.
  • You have an Exchange Server 2010 Client Access server and an Office Outlook user in the other Active Directory site.
  • The user whose mailbox is located in the different Active Directory site tries to download the Exchange Offline Address Book (OAB).

In this scenario, it takes a long time to download the OAB.

Well yes. No kidding. It really can. If you have a large OAB, it can really, really take a long time. But let’s expand on the scenario a little, as frankly there’s a bit of information I think you need to know, and having an AD site with nothing but a CAS in it doesn’t seem like a very smart move to most people.

So consider this more detailed scenario instead;

  • You have a centralized deployment. All mailboxes are in one central location.
  • You have lots of small locations where people touch down and work.
  • These locations are connected to the central site with poor networks. Satellite, ISDN, PSTN, tropospheric scatter (I had a customer with one of these once. Brilliant. Until there was a storm), wet piece of string, etc.
  • Your OAB is big. It is large. It is not small. Take your pick of the definition you like best. Suffice to say, it’s of significant size that you care.
  • Your Outlook client tries to download the OAB, and it comes from the central datacenter. So does the Outlook client being used by the person sitting next to you, and the funny looking guy over there in the corner too. All of you are downloading the same OAB. Over the same wet piece of string. It’s getting very slow.

With luck you can see that you are all competing for the same bandwidth, while also trying to work, and even though the BITS client technology used for OAB downloads is good, it’s not really going to help you much.

So you add a CAS to each remote location, just as the diagram detailed in http://technet.microsoft.com/en-us/library/bb232155.aspx suggests. The idea is that the client computer will download the OAB it needs from the local CAS. Well, it might sound like a great idea – but that’s not how Exchange has ever worked. Prior to 2010 SP2 RU1, that is…

How did it work then? And why am I telling you that TechNet lied to you?

Well to answer the first question, the URL the client uses to download the OAB from is provided to the client by the AutoDiscover service. And the AutoDiscover code has always picked a URL for the OAB you should be downloading from the AD site that your mailbox is in, not the AD site your client computer is in.

To answer the second of those questions, you need to first understand that TechNet is never wrong (my friends in UE, like Scott Schnoll, get real touchy if you imply their articles are incorrect). It’s just that sometimes it isn’t right from a certain point of view, either. TechNet details this as it was part of the original PM specification back when 2007 was being designed. I probably shouldn’t have told you that, but heck, it was. And it didn’t get done. These things happen, you know, in a software product with over 20 million lines of code where stuff changes all the time. TechNet doesn’t usually lie. Well, not much.

Back to how it works. Just think about it for a moment. You have a 1 GB OAB. And you add a replica of that OAB to a CAS in the remote and distant AD site, where the users are. However, they never use it (OK, unless their mailboxes are also in the same AD site, but that’s not the scenario, is it?). That kind of sucks, doesn’t it? Yes, it does, I hear you say. It looks a bit like this diagram.

Outlook uses the CAS closest to the client computer for the client’s AutoDiscover requests (well, it should, and we’ll come back to that in a moment) but the OAB URL it hands back is for the CAS in the same AD site as the mailbox. So even though we are replicating the OAB to AD Site B, the client pulls the OAB from AD Site A.

So, a large customer with lots of small sites and a whopping OAB tells us this won’t work and downloads are killing whatever WAN bandwidth they have. So, what can we do about this? It turns out there are a few ways to solve this, and I have to add that this is one of the fun bits of my job, trying to figure this kind of thing out. It’s a nerd thing.

  1. They could reduce the size of their OAB, speed up their WAN, move the remote offices closer etc. None of these will fly for them as a solution. Though we did ask.
  2. We could create lots of OABs that have the same content. And specify on a per-user, or per-database level the OAB the user should download. And then we only have that OAB available in the remote location. Therefore AutoDiscover will provide the only URL it can for it, in the remote location. Now this sounds good, except the users move from site to site. And a download then would mean a double slow network hop. Ouch. Scratch that.
  3. Same thing with mailboxes – move the mailboxes to the remote locations… well, they move around plus that would really complicate administration and High Availability and consequently increase cost.
  4. We could do some kind of reverse IP address to AD site mapping thing. Now I believe this was the original way we had planned to solve this, and it’s actually kind of hard. It’s hard because you need to ensure all subnets a client could come from are in AD Sites and Services, and then try and reverse engineer the AD site the user is in, and then look at site link costs and …you get the idea I hope. It’s complex, and defeated by NAT, or if the admin doesn’t list every possible subnet in AD Sites and Services.
  5. We could ‘interfere’ with DNS or the AutoDiscover XML to try and make the client think it is talking to the centralized location but in fact be talking to a local IIS instance. Again, it’s hard, tricky to implement and support and just plain ugly if you’re asking.
  6. Something else. I picked this one, as the others seemed really hard.

So cast your mind back just a few short paragraphs to the sentence that stated “Outlook uses the CAS closest to the client computer for the client’s AutoDiscover requests”, the one that I said I would come back to. Well, it is worth returning to because of something called AutoDiscoverServiceSiteScope.

AutoDiscoverServiceSiteScope is a CAS setting that helps the Outlook client map AD sites to CAS for the purposes of finding the closest CAS to the client for AutoDiscover requests. Outlook does this by seeking out Service Connection Points (SCP’s), which are in fact pointers to the AutoDiscover service.

Here’s how it works. When an Outlook client starts up he heads off to the triangle, sometimes and otherwise known as ‘AD’, and looks for all the SCP’s put there by Exchange setup. He finds a bunch (we hope), and on each is an attribute, the Keywords attribute, which is set/changed/sometimes messed up by the use of Set-ClientAccessServer –AutoDiscoverServiceSiteScope: ADSiteNameA, ADSiteNameB, etc. The Keywords attribute is used to specify which AD sites this CAS is responsible for, for AutoDiscover requests.

When the Outlook client finds more than one SCP he builds himself a list of usable SCP’s by comparing the value stored on the Keywords attribute with his own AD site (which is dynamically updated by the local Netlogon service, when he starts up or changes IP address).

He then builds one list: either all those that match his AD site (where Keywords attribute = client AD Site) or, if there are none, every SCP he found. These are the servers he can use for his AutoDiscover requests.

He then starts at the top of the list (which is always in the same order by the way, by date of install) and tries to connect to the URI contained within the ServiceBindingInformation attribute – which is the location of the AutoDiscover service itself. He then posts XML, gets a response etc., and then lives happily ever after. More details for all this good AutoDiscover stuff can be found here.
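The selection logic described above can be sketched in a few lines of Python. This is only an illustration of the behavior as described in this post, not Outlook's actual code: the `ServiceConnectionPoint` class, the `usable_scps` function, the example URLs and the idea of modelling Keywords as a plain list of site names are all assumptions made for the sketch, as is the case-insensitive comparison of site names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceConnectionPoint:
    """Hypothetical model of an SCP: the AutoDiscover URI plus its site scope."""
    service_binding_information: str  # URI of the AutoDiscover service
    keywords: List[str]               # AD site names this CAS is scoped to (simplified)


def usable_scps(scps: List[ServiceConnectionPoint],
                client_site: str) -> List[ServiceConnectionPoint]:
    """Return the SCPs scoped to the client's AD site, or every SCP if none match.

    The input order is preserved, mirroring the fixed ordering mentioned above.
    """
    wanted = client_site.casefold()  # assumption: compare site names case-insensitively
    in_site = [s for s in scps
               if any(k.casefold() == wanted for k in s.keywords)]
    return in_site if in_site else list(scps)


# Hypothetical SCPs for two sites.
scps = [
    ServiceConnectionPoint("https://cas-a.example.com/autodiscover/autodiscover.xml", ["SiteA"]),
    ServiceConnectionPoint("https://cas-b.example.com/autodiscover/autodiscover.xml", ["SiteB"]),
]

# A client in SiteB only tries the SiteB CAS; a client in an unscoped site tries them all.
site_b_choices = usable_scps(scps, "SiteB")
fallback_choices = usable_scps(scps, "SiteC")
```

The client would then walk the returned list from the top and try each URI in turn.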

Why is this interesting? Well this AutoDiscoverServiceSiteScope thing helps Outlook find the CAS closest to the client’s location, assuming the admin has set up the site scopes correctly (and we do tell admins how to do that). So we really don’t need to figure out which CAS is closest to the client once we get the request, as that has already happened by the time the request reaches CAS.

Once that request hits CAS we figure out the settings to return to the client – but until now we always forgot one thing: the OAB the user needs could be local to the CAS we are executing the request on. Instead, we always gave the user a URL from a CAS way, way, over there. And that’s what we needed to fix.

The solution for this is therefore theoretically very simple and it means we don’t have to invent a new way to figure out the closest CAS to the client, as we already have one which works quite well thank you very much.

If we were to make the assumption that the admin has set up AutoDiscoverServiceSiteScope correctly, the CAS the client connects to for AutoDiscover will be the CAS closest to the client. If this assumption holds true, the CAS, when figuring out what to return in the AutoDiscover XML, simply needs to check whether he himself has a copy of the OAB the user should be using – and if so, he simply provides his own OAB URL rather than that of a CAS in the AD site where the user’s mailbox is located. Of course if he doesn’t have a copy of the OAB the user needs, the old behavior should prevail, meaning the CAS will return the OAB URL of a CAS in the Mailbox AD site.
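Reduced to its essentials, the decision looks something like the Python sketch below. It is a toy model of the rule described in this post, not the AutoDiscover code itself: the function name `choose_oab_url`, its parameters and the example URLs are all hypothetical.

```python
from typing import Set


def choose_oab_url(user_oab: str,
                   local_cas_oabs: Set[str],
                   local_cas_oab_url: str,
                   mailbox_site_oab_url: str) -> str:
    """Pick the OAB URL to hand back in the AutoDiscover response.

    New behavior: if the CAS handling the request distributes the user's OAB,
    return its own URL. Otherwise fall back to the old behavior and return
    the URL of a CAS in the mailbox's AD site.
    """
    if user_oab in local_cas_oabs:
        return local_cas_oab_url
    return mailbox_site_oab_url


# Hypothetical values: the remote-site CAS holds a copy of "Default OAB".
local_url = "https://cas-siteb.example.com/OAB/"
central_url = "https://cas-sitea.example.com/OAB/"

served_locally = choose_oab_url("Default OAB", {"Default OAB"}, local_url, central_url)
served_centrally = choose_oab_url("Sales OAB", {"Default OAB"}, local_url, central_url)
```

Note that the sketch never works out where the client is. It relies entirely on the client having already found its closest CAS through the site scope, which is the whole point of the fix.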

So basically the picture changes to look like this;

Now that’s much friendlier to the WAN isn’t it? One copy replicates over the WAN and all clients in that location will now get the OAB from the CAS local to them.

What do you have to do to get this new behavior to kick in? Just two things. Deploy SP2 RU1 on the CAS, and ensure that your AutoDiscoverServiceSiteScope parameters are set up correctly.

I hope you find this useful, and may your WAN forever be a long fat pipe.

Greg Taylor
Principal Program Manager
Exchange Customer Experience

Categories: Exchange feeds

Exchange 2010 SP2 RU1 and CAS-to-CAS Proxy Incompatibility

February 17, 2012 - 5:35pm

We wanted to give you a heads up regarding a change in CAS to CAS proxy behavior between servers running Exchange 2010 SP2 RU1 and servers running older versions of Exchange.

The SP2 RU1 package introduced a change to the user context cookie which is used in CAS-to-CAS proxying. An unfortunate side-effect is a temporary incompatibility between SP2 RU1 servers and servers running earlier versions of Exchange. The change is such that earlier versions of Exchange do not understand the newer cookie used by the SP2 RU1 server. As a result, proxying from SP2 RU1 to an earlier version of Exchange will fail with the following error:

Invalid user context cookie found in proxy response

The server might show exceptions in the event log, such as the following:

Event ID: 4999
Log Name: Application
Source: MSExchange Common
Task Category: General
Level: Error
Description: Watson report about to be sent for process id: 744, with parameters: E12, c-RTL-AMD64, 14.02.0283.003, OWA, M.E.Clients.Owa, M.E.C.O.C.ProxyUtilities.UpdateProxyUserContextIdFromResponse, M.E.C.O.Core.OwaAsyncOperationException, 413, 14.02.0283.003.

Not all customers are affected by this. But since we received a few questions about this, we wanted to let you know about the change. Many Exchange customers do not use proxying between Exchange 2010 and Exchange 2007 but rather use redirection, which is not affected by the change. However, if you are using CAS-to-CAS proxying, where an Exchange 2010 SP2 RU1 Client Access server is proxying to an earlier version of Exchange 2010 or Exchange 2007 Client Access server, then you are affected by the change.

If you are affected, it is important to note that this issue is temporary and will exist only until all of the CAS involved in the CAS-to-CAS proxy process are updated to Exchange 2010 SP2 RU1. Thus, if you are affected by this problem, simply deploy SP2 RU1 on the relevant Exchange 2010 servers and the issue no longer exists.

If you use CAS-to-CAS proxy between Exchange 2010 and Exchange 2007, we will have an interim update (IU) for Exchange 2007. Availability of the IU will be announced on this blog.

| Server proxy version | Server being proxied to | Action to take |
|---|---|---|
| Exchange 2010 SP2 RU1 | Any version of Exchange 2010 older than SP2 RU1 | Apply Exchange 2010 SP2 RU1 to all servers involved in proxy process |
| Exchange 2010 SP2 RU1 | Exchange 2007 | Hold off deployment of Exchange 2010 SP2 RU1 until you deploy the Exchange 2007 IU |

The Exchange Team


Geek Out With Perry on immutability of email data

February 15, 2012 - 8:41pm

As we’ve seen with previous episodes of Geek Out with Perry and on Perry Clarke’s blog, email archiving can be a heated and controversial topic. It’s one that people are very passionate about – including the folks on the Exchange team and Perry himself. We’ve already covered tiered storage and stubbing as well as our archiving methodology in previous blogs and videos, but Perry’s new post and video take on another common question: “How does Exchange help me with immutability of my email data?” Read his blog and watch the video to see his take on what immutability is and how Exchange can help customers with their compliance requirements. For additional details on achieving immutability, you can also check out our immutability whitepaper.

We’ve also heard feedback recently that some of you would like alternate ways to view the Geek Out with Perry video series. Ask and ye shall receive! We now have two options for you to view Geek Out with Perry:

  • The Exchange YouTube channel, which features other awesome Exchange videos you should check out. To view the entire Geek Out with Perry playlist, click here.
  • The MSN Video catalogue which hosts all of the Exchange TechNet videos. To view the entire Geek Out with Perry playlist from that channel, click here.

We love geeking out on Exchange topics and want to hear your feedback and questions. Please let us know if you have other subjects you’d like to have Perry geek out on.

Cheers!

Ann Vu


Released: Update Rollup 1 for Exchange 2010 Service Pack 2

February 13, 2012 - 4:03pm

Earlier today the Exchange CXP team released Update Rollup 1 for Exchange Server 2010 SP2 to the Download Center.

This update contains fixes for a number of customer-reported and internally found issues since the release of SP2. See KB 2645995: 'Description of Update Rollup 1 for Exchange Server 2010 Service Pack 2' for more details.

Note: If some of the following KB articles do not work yet, please try again later.

We would like to specifically call out the following fixes which are included in this release:

  • New updates for Dec DST - Exchange 2010 - SP2 RU1 - Display name for OWA.
  • 2616230 Exchange 2010 CAS server treats UTF-7 encoding NAMESPACE string from CHS Exchange 2003 BE server as ASCII, caused IMAP client fails to login.
  • 2599663 RCA crashes when recipient data is stored in bad format.
  • 2492082 Freebusy publish to Public Folders fails with 8207 event.
  • 2666233 Manage hybrid configuration wizard won't accept domains starting with a numeral for FOPE outbound connector FQDN.
  • 2557323 "UseLocalReplicaForFreeBusy" functionality needed in Exchange 2010.
  • 2621266 Exchange 2010 Mailbox Databases not reclaiming space.
  • 2543850 Exchange 2010 GAL based Outlook rule not filtering emails correctly.

General Notes:

For DST Changes: http://www.microsoft.com/time.

Note for Forefront Protection for Exchange users: For those of you running Forefront Protection for Exchange, be sure you perform these important steps from the command line in the Forefront directory before and after this rollup's installation process. Without these steps, Exchange services for Information Store and Transport will not start after you apply this update. Before installing the update, disable Forefront by using this command: fscutility /disable. After installing the update, re-enable Forefront by running fscutility /enable.

Exchange Team


Announcing the Exchange Client Network Bandwidth Calculator Beta

February 10, 2012 - 12:45pm

I am extremely pleased to announce that the all new Exchange Client Network Bandwidth Calculator Beta is available for download!!

Over the past 12 months we have been working on a new calculator to help with Exchange client network bandwidth approximation. This new calculator is based on all new prediction data and is designed to work with both Exchange on-premises and Office 365 deployments! (Yes, we know it’s long overdue!)

What does it do?

The brief was concise and simple for this calculator; we wanted to be able to predict the client network bandwidth requirements for a specific set of users. The calculator needed to deal with Outlook, OWA and Mobile Devices, both on-premises and for Office 365 scenarios.

The following clients are included in this Beta; further clients will be added over time.

  • Outlook 2010
  • Outlook 2007
  • Outlook 2003
  • OWA 2010
  • OWA 2007
  • Windows Mobile
  • Windows Phone

How does it work?

The calculator is based on new prediction algorithms derived after analysing the behaviour of each client individually. This approach allows a bandwidth model to be created for each client scenario which is very scalable and flexible.

Input data is based on existing user profile metrics, such as messages sent and received per user per day and average message size. Once these parameters are provided the calculator is able to predict how much bandwidth each client will require to perform adequately.

The predictions provided represent the requirements during the busiest two hours of the working day.

Why a Beta?

The new prediction algorithms have been created from scratch and validated for accuracy internally; however we would like to gather some more telemetry data from real world scenarios to fine tune the calculator prediction formulae. During the Beta process we would love to hear your feedback and suggestions for the calculator. If you can provide real world prediction vs. observations data for your infrastructure that would also be extremely welcome!

Suggestions and feedback requests should be sent to netcalc@microsoft.com

The goal is to complete the Beta process by mid-2012.

How do I use it?

The calculator is split into two main sheets in Excel.

  • Input – A place to enter organization information and usage profile information
  • Client Mix – A place to enter how many clients of each type and profile exist in each site

There is an accompanying manual that explains things in more detail, so I will only take a quick look here.

The Input Sheet

The input sheet is broken up into five sections;

  1. Organization Data
  2. User Profile 1
  3. User Profile 2
  4. User Profile 3
  5. User Profile 4

The Organization data section represents global settings that apply for the entire organization and the user profiles are pre-defined profiles that represent sets of users from light through to very heavy. The user profiles are customizable and should be edited to reflect your own environment for an accurate prediction.

The Client Mix Sheet

Once you have completed the Input Sheet, you can move on to the Client Mix sheet. This is where you can list out the number of each client and define your sites. The sheet is made up of three sections;

  1. Site Definition
  2. Client Definition
  3. Network Predictions

The site definition section allows you to configure a representative model of your physical network site topology; this should represent physical sites and the expected user usage profile for that site. The Client definition section allows you to configure how many users will exist at each site and which type of Exchange client they will be using. The Network predictions section shows the predicted network requirement for each defined site.

An Example

To make things a little easier I am going to walk through a very basic example to get us started.

In this example we have a customer who is moving to Office 365. They want to know how much Internet bandwidth will be required to support their Exchange clients after the migration is completed.

Organization Information

  • 3 Main sites (3650 users)
    • London:
      • 1500 Outlook 2007 Users (Medium Profile)
      • 300 Outlook Web Access users (Light Profile)
    • Manchester:
      • 600 Outlook 2007 Users (Heavy Profile)
      • 150 Outlook Web Access users (Light Profile)
    • Paris:
      • 1100 Outlook Web Access (Light Profile)

London and Manchester share the same Internet connection, but Paris has its own local breakout.

Note: For this example I am going to use the built-in user profile data to keep things simple, however it is strongly recommended to define your own user profile data based on research into your messaging solution.

The first thing we need to do is to configure the Input Sheet. The defaults are pretty good in this example, but the OAB size is actually 10MB rather than 100MB so I will set the Offline Address Book Size to 10MB.

The user profiles I would usually edit, but in this case I will leave them at their default settings and move on to the Client Mix sheet. The Client Mix sheet will give us totals, so I generally group together sites that share the same internet connection. In this instance that means we can put London and Manchester on the same sheet but we need a new sheet for Paris. To make a copy of the sheet, right click on the tab at the bottom and select Move or Copy; in the Move or Copy Dialog highlight the Client Mix sheet, tick the box to Create a copy and then click OK.

Your Excel workbook will now contain “Client Mix (2)” and “Client Mix” – I generally rename these to something meaningful, in this instance I am going to rename one UK Sites and the other FR Sites.

We will begin by defining the sites in the UK sites sheet. The information we have suggests that we have two sites, London and Manchester, and that there are two user profile types in each site. Since we can only use a single user profile per site, this means we are going to need four site entries…

We then need to define the types of clients that will exist in each site. I have hidden some rows and columns that we don’t need to make the data easier to read. I have entered the number of each client type into the sheet – we know that it must be OA-cached for Outlook 2007 since Office 365 only provides an Outlook Anywhere connection and we know that OWA must be 2010 since again, Office 365 is based on Exchange Server 2010.

If we hide some more cells we can take a better look at the prediction values.

Firstly we are generally interested in the “Exchange to Client” requirements since they are higher and most links are still provided with the same upload and download capacity. Where you have an asymmetric line then you may need to look at the “Client to Exchange” bandwidth also. In this example the customer has a symmetric connection.

London has two sets of users defined and the calculator predicts that the Outlook users will need 3.66Mbits/sec of bandwidth and that the OWA users will require a further 0.93Mbits/sec. The total for London is 4.59Mbits/sec (you need to do this manually in this case).

Manchester also has two sets of users defined and the calculator predicts a total of 3.53Mbits/sec.

Since both Manchester and London share the same internet connection, the calculator is predicting that the customer will need to ensure that 8.12Mbits/sec of network bandwidth is available to support this workload and that the maximum network link latency is 320ms or less.
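The manual totalling is simple enough to check with a few lines of Python. The sketch below only reproduces the arithmetic from this example using the figures quoted above; the `link_total` helper is a hypothetical name and is not part of the calculator. Decimal is used so the two-decimal figures add up exactly.

```python
from decimal import Decimal
from typing import Iterable


def link_total(predictions_mbits: Iterable[str]) -> Decimal:
    """Sum per-profile or per-site predictions (in Mbits/sec) that share one link."""
    return sum((Decimal(p) for p in predictions_mbits), Decimal("0"))


# London: Outlook 2007 users plus OWA users, figures taken from the example above.
london = link_total(["3.66", "0.93"])

# Manchester: the total quoted in the example above.
manchester = Decimal("3.53")

# London and Manchester share one Internet connection, so their totals add up.
uk_link = london + manchester

print(london, uk_link)  # prints "4.59 8.12"
```

The same approach applies to any group of sites that break out through a single connection: sum the “Exchange to Client” predictions for every site entry on that link.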

If we repeat this for our FR Sites tab…

The calculator predicts that the internet connection in Paris will require 1.88Mbits/sec of available network bandwidth to support their 1100 Office 365 OWA users.

This is obviously a fairly simple example but I would encourage you to model your own organization to get a feel for the calculator and provide feedback on how the calculator is working for you.

What doesn’t it do?

The calculator does not provide information on the following…

  • Non-Microsoft clients: You will need to speak to the specific vendors to get bandwidth information for their clients.
  • BlackBerry: I know this is a non-Microsoft client but everyone asks about it! You will need to speak with BlackBerry to get this data.
  • Server-side Bandwidth Data: Data such as SMTP, DAG replication, ADFS 2.0, and authentication etc. are all out of scope for this calculator.
  • Outlook 2011 / Entourage EWS: These clients are being analyzed currently and will be added during the Beta timeframe.
  • Migration Traffic: The calculator predicts steady state traffic requirements
  • Outlook 2000 and older: Outlook 2003 is the oldest client included in this calculator

Feedback and Other Stuff…

We have published other network bandwidth guidance on TechNet; the most commonly used guidance is in the White Paper: Outlook Anywhere Scalability with Outlook 2007, Outlook 2003 and Exchange 2007. This guidance was produced using Loadgen during lab testing of CAS scalability. The predictions from this testing vary slightly from those in the new calculator due to the way the data was gathered in each case. Since the newer test data was specifically generated and analyzed to enable network bandwidth prediction the newer values should be more precise; the new calculator also takes into account many more variables and user profiles than the guidance in the Outlook 2007 White Paper, so again this should provide a more accurate prediction.

During the Beta process we recommend that you use both the old white paper and the new calculator to determine your requirements.

As I said at the beginning of this post (which is now much longer than I wanted it to be!), I am really interested to hear feedback from you after using the calculator; positive, negative and requests for help or feature requests etc. Send your feedback (please be nice!) to…

netcalc@microsoft.com

I will be writing some more posts regarding this calculator over the coming months, with more examples and a deep-dive that explains how the prediction data was generated.

Thanks for reading and I hope you find this new calculator useful.

Neil Johnson
Senior Consultant, MCS UK


Recovering Public Folders After Accidental Deletion (Part 2: Public Folder Architecture)

February 8, 2012 - 10:00am

Introduction

In the previous blog entry, I explained how to safely recover accidentally deleted public folders from backup. I briefly mentioned some important public folder concepts in that article, and in this, the second part, I’m going to describe some of the inner workings of public folders themselves.

Each organization maintains a list of all public folders in the environment, as well as which servers host replicas of each folder. This list is called the hierarchy, and it's common to all public folder stores in the environment. Each public folder store has a copy of the hierarchy, and uses it to provide referrals to end users for public folder replicas on other servers (among other things). Each public folder store also maintains a table, called the replication state table, which keeps track of the status of each folder. This table is a critical yet little understood feature of public folders, and it has a huge impact on recovery.

Overview

As I said above, each public folder store maintains a replication state table, but unlike the hierarchy, it's unique to each store.  A public folder store maintains information about the public folders for which it has a replica, not just for itself but for all servers with that replica.  It does this so that it knows which other stores have more up-to-date public folder content, or which ones might have items required for backfill replication (catching up on old or missing items).

Imagine the following scenario: we have three servers, each hosting a public folder database – PFS1, PFS2, and PFS3. We have a folder – Folder1 – which is replicated to each database. If I could peer into the replication state table for PFS1, I would see an entry for Folder1, and that entry would contain information about Folder1's status not only for PFS1, but also for PFS2 and PFS3. What kind of information does this table actually contain? To answer that, we need to dig yet further into public folder structure, and talk about CNs.

Change Numbers

CNs – or, to give their full name, change numbers – are numbers assigned to each modification made to content in a public folder.  Think of them as per-folder odometers – they increment each time a change is made to a folder, and only increase, never decrease. Each public folder assigns CNs to the changes made on a given replica, and that information is transmitted to other replicas.  These other replicas use this information to see if they've already received a particular change.  For example, if I make a change to Folder1 on PFS1, that database might assign change number 211 to that modification.  When the public folder database replicates that change to other databases, it records and transmits that change as FID1-123:PFS1:211.  [Folder1 is represented within the public folder database, and by extension in the replication traffic, by a folder ID (FID). This becomes very important later.] PFS2 receives the replication message and checks to see if it has already received CN 211 from PFS1.  If it hasn't, it applies the change and updates its own entry in the replication state table to reflect the fact that it has now received change 211 for Folder1 (FID1-123) from PFS1.  If PFS3 later replicates the same change (FID1-123:PFS1:211) to PFS2, PFS2 will check its list, see that it has indeed already received that change, and discard that particular replication message.
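The "have I already seen this change?" check can be modelled in a few lines of Python. This is a deliberately simplified sketch of the idea described above, not how the store implements it: the `ReplicationState` class and `parse_change` helper are hypothetical names, and the `FID:server:CN` string is just the notation used in this post rather than a real wire format.

```python
from typing import Dict, Set, Tuple


def parse_change(token: str) -> Tuple[str, str, int]:
    """Split the post's 'FID:origin:CN' notation, e.g. 'FID1-123:PFS1:211'."""
    fid, origin, cn = token.split(":")
    return fid, origin, int(cn)


class ReplicationState:
    """Toy replication state table: which CNs were received, per folder and origin."""

    def __init__(self) -> None:
        self._seen: Dict[Tuple[str, str], Set[int]] = {}

    def receive(self, token: str) -> bool:
        """Record a change; return True if applied, False if discarded as a duplicate."""
        fid, origin, cn = parse_change(token)
        seen = self._seen.setdefault((fid, origin), set())
        if cn in seen:
            return False  # already received this change, so discard the message
        seen.add(cn)
        return True


pfs2 = ReplicationState()
first = pfs2.receive("FID1-123:PFS1:211")   # change arrives from PFS1
second = pfs2.receive("FID1-123:PFS1:211")  # same change relayed later by PFS3
```

Here `first` is applied and `second` is discarded, which is exactly what PFS2 does in the example above.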

Here’s a sample hierarchy replication message. Notice the CN min, CN max, and FID entries in the description field.

Event Type: Information
Event Source: MSExchangeIS Public Store
Event Category: Replication Outgoing Messages
Event ID: 3018
Description:
An outgoing replication message was issued.
Type: 0x2
Message ID: <23599A0EB070AA92F03E31C546C9C8FFA4F7@contoso.com>
Database "PFDB"
CN min: 1-11D3, CN max: 1-11D4
RFIs: 1
1) FID: 1-38BF, PFID: 1-1, Offset: 28
        IPM_SUBTREE\TestPF
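If you collect many of these events, pulling the CN range and FIDs out of the description text is straightforward. The sketch below is an assumption-laden example rather than a supported tool: the function name is mine, and the regular expressions are written against the sample event above only.

```python
import re

# Description text copied from the sample event above (Message ID line omitted).
SAMPLE = r"""An outgoing replication message was issued.
Type: 0x2
Database "PFDB"
CN min: 1-11D3, CN max: 1-11D4
RFIs: 1
1) FID: 1-38BF, PFID: 1-1, Offset: 28
        IPM_SUBTREE\TestPF"""


def parse_replication_event(description: str) -> dict:
    """Extract the CN min/max pair and every FID from an event description."""
    cn = re.search(
        r"CN min:\s*([0-9A-Fa-f-]+),\s*CN max:\s*([0-9A-Fa-f-]+)", description
    )
    # \b keeps "FID:" from matching inside "PFID:".
    fids = re.findall(r"\bFID:\s*([0-9A-Fa-f-]+)", description)
    return {
        "cn_min": cn.group(1) if cn else None,
        "cn_max": cn.group(2) if cn else None,
        "fids": fids,
    }


parsed = parse_replication_event(SAMPLE)
print(parsed)
```

The values are kept as strings because the event text does not say how they should be interpreted numerically.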

At any given time, each public folder store knows exactly what content it has, and has a general idea of what content the other public folder stores have.  This is an important point - public folder databases are aware of their surroundings.  It's this awareness that has implications for recovery.

The Replication State Table

Here’s a quick visualization of how a public folder change is propagated from one server to another. This table simulates the replication state table which is internal to every server. There are four columns – the first represents the replication details (the CN sets), and the next three represent the same folder on each of the three servers. In essence, this table shows you what each server knows about the other servers’ knowledge of this particular folder. Please note that this is a simplified version of the replication state table – it’s actually quite a bit more complicated than this, but this is all the detail 99.99% of engineers will ever need.

In this example, Folder1 has been replicated to three systems – PFS1, PFS2, and PFS3 – and public folder replication is fully up-to-date. The servers know what they’ve sent to their replication partners, and what’s been replicated back to them. Since end users could conceivably make updates on any of the servers, they each have their own CN sets for the same folder.

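Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3
-------------+----------------------+----------------------+----------------------
PFS1         | Last sent CN PFS1:10 | FID1-123:PFS1:1-10   | FID1-123:PFS1:1-10
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS2         | FID1-123:PFS1:1-10   | Last sent CN PFS2:20 | FID1-123:PFS1:1-10
             | FID1-123:PFS2:1-20   |                      | FID1-123:PFS2:1-20
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS3         | FID1-123:PFS1:1-10   | FID1-123:PFS1:1-10   | Last sent CN PFS3:30
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   |
             | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30   |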

An end user connected to PFS1 makes a change, to which PFS1 assigns change number 11. The replication state table on PFS1 is updated to reflect this new CN.

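Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3
-------------+----------------------+----------------------+----------------------
PFS1         | Last sent CN PFS1:11 | FID1-123:PFS1:1-10   | FID1-123:PFS1:1-10
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS2         | FID1-123:PFS1:1-10   | Last sent CN PFS2:20 | FID1-123:PFS1:1-10
             | FID1-123:PFS2:1-20   |                      | FID1-123:PFS2:1-20
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS3         | FID1-123:PFS1:1-10   | FID1-123:PFS1:1-10   | Last sent CN PFS3:30
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   |
             | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30   |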

PFS1 packages this change (which we assume is the only one made to Folder1) and sends it to PFS2 and PFS3, which update their own replication state tables.

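Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3
-------------+----------------------+----------------------+----------------------
PFS1         | Last sent CN PFS1:11 | FID1-123:PFS1:1-11   | FID1-123:PFS1:1-11
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS2         | FID1-123:PFS1:1-10   | Last sent CN PFS2:20 | FID1-123:PFS1:1-10
             | FID1-123:PFS2:1-20   |                      | FID1-123:PFS2:1-20
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS3         | FID1-123:PFS1:1-10   | FID1-123:PFS1:1-10   | Last sent CN PFS3:30
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   |
             | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30   |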

Both PFS2 and PFS3 apply the changes, and since those two systems received the change from PFS1, they also update their “knowledge” of PFS1. Notice that PFS1 does not update its entries for PFS2 and PFS3 – while it has sent the content to them, it hasn’t received confirmation that they’ve applied that change. [Because public folder replication messages are delivered via Hub Transport, public folder stores don’t directly interact and so never assume that the updates were delivered and applied.]

Continuing with our example, an end user makes a change to Folder1 on PFS3:

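Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3
-------------+----------------------+----------------------+----------------------
PFS1         | Last sent CN PFS1:11 | FID1-123:PFS1:1-11   | FID1-123:PFS1:1-11
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS2         | FID1-123:PFS1:1-10   | Last sent CN PFS2:20 | FID1-123:PFS1:1-10
             | FID1-123:PFS2:1-20   |                      | FID1-123:PFS2:1-20
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS3         | FID1-123:PFS1:1-10   | FID1-123:PFS1:1-10   | Last sent CN PFS3:31
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   |
             | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30   |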

That change is now replicated to PFS1 and PFS2:

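Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3
-------------+----------------------+----------------------+----------------------
PFS1         | Last sent CN PFS1:11 | FID1-123:PFS1:1-11   | FID1-123:PFS1:1-11
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS2         | FID1-123:PFS1:1-10   | Last sent CN PFS2:20 | FID1-123:PFS1:1-10
             | FID1-123:PFS2:1-20   |                      | FID1-123:PFS2:1-20
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+----------------------
PFS3         | FID1-123:PFS1:1-11   | FID1-123:PFS1:1-11   | Last sent CN PFS3:31
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   |
             | FID1-123:PFS3:1-31   | FID1-123:PFS3:1-31   |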

Note that when PFS3 sent out its replication message, it included not only its own update, but also the fact that it had received update 11 from PFS1.

Again, while every server has the most up-to-date content for Folder1, they don’t necessarily know that every replica is up-to-date. [PFS1, for example, “thinks” that PFS2 is out of date, while PFS3 “thinks” that both PFS1 and PFS2 are out of date.] It’s important to note that this isn’t a problem – by only encapsulating status messages in outgoing replication, Exchange avoids saturating the network with constant messages from various servers confirming the receipt of recent replication messages.
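To make the "piggybacked status" idea concrete, here is a toy simulation of the two changes walked through above. It is a deliberately simplified model: the Server class and its method names are invented for this example, each CN set is reduced to a single highest number, and the real replication state table holds far more detail.

```python
class Server:
    """A toy public folder store tracking the highest CN it holds per originating server."""

    def __init__(self, name, initial):
        self.name = name
        self.have = dict(initial)  # origin -> highest CN applied locally
        # What this server believes each partner holds (partner -> {origin: highest CN}).
        self.view = {p: dict(initial) for p in initial if p != name}

    def local_change(self):
        """An end user modifies the folder here; assign the next CN."""
        self.have[self.name] += 1

    def outgoing(self):
        """Build a replication message; the sender's status rides along with the content."""
        return {"sender": self.name, "cn": self.have[self.name], "status": dict(self.have)}

    def receive(self, msg):
        sender = msg["sender"]
        self.have[sender] = max(self.have[sender], msg["cn"])  # apply the content
        self.view[sender] = dict(msg["status"])  # remember what the sender says it holds

    def thinks_out_of_date(self, partner):
        """True if, as far as this server knows, the partner lacks something it has."""
        return any(self.view[partner][o] < cn for o, cn in self.have.items())


start = {"PFS1": 10, "PFS2": 20, "PFS3": 30}
pfs1, pfs2, pfs3 = (Server(n, start) for n in start)

pfs1.local_change()  # PFS1 assigns CN 11
msg = pfs1.outgoing()
pfs2.receive(msg)
pfs3.receive(msg)

pfs3.local_change()  # PFS3 assigns CN 31
msg = pfs3.outgoing()
pfs1.receive(msg)
pfs2.receive(msg)

print(pfs1.have, pfs2.have, pfs3.have)
print(pfs1.thinks_out_of_date("PFS2"), pfs3.thinks_out_of_date("PFS1"))
```

Every server ends up holding the same content, yet PFS1 still believes PFS2 is behind and PFS3 believes both of its partners are behind, because no server has heard back from a partner since that partner applied the latest change.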

Backfill Replication

However, from time to time, a server loses its connection to its replication partners, either through network failure, service failure, or other causes. When it does, its replication state table no longer receives updates to the CNs held by its partners for their replicas. In other words, its replication state table is outdated. When that server reconnects with its partners, and receives a new message, it may find that the CN on that new message is much higher than what it expected. Using the previous example, imagine that PFS3 is isolated from PFS1 and PFS2 due to a server failure, and does not receive updates to Folder1 from the other servers for several hours. The resulting table might look like this:

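Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3 (OFFLINE)
-------------+----------------------+----------------------+---------------------------
PFS1         | Last sent CN PFS1:16 | FID1-123:PFS1:1-16   | FID1-123:PFS1:1-11
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+---------------------------
PFS2         | FID1-123:PFS1:1-16   | Last sent CN PFS2:28 | FID1-123:PFS1:1-10
             | FID1-123:PFS2:1-28   |                      | FID1-123:PFS2:1-20
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+---------------------------
PFS3         | FID1-123:PFS1:1-11   | FID1-123:PFS1:1-11   | Last sent CN PFS3:31
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   |
             | FID1-123:PFS3:1-31   | FID1-123:PFS3:1-31   |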

Notice that the most recent replication message from PFS2, for change number 28, also told PFS1 what PFS2 knows about PFS1 (namely, that PFS2 has received PFS1’s updates 12 through 16). PFS3 has not received any of these recent updates.

However, when PFS3 is brought back online, and receives a new replication message, it suddenly learns of the missing messages. This triggers a backfill request – a request from PFS3 to the source server for the missing content.

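Details From | Folder1 on PFS1      | Folder1 on PFS2      | Folder1 on PFS3
-------------+----------------------+----------------------+-----------------------------
PFS1         | Last sent CN PFS1:17 | FID1-123:PFS1:1-17   | FID1-123:PFS1:1-11, 17
             |                      | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20
             |                      | FID1-123:PFS3:1-30   | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+-----------------------------
PFS2         | FID1-123:PFS1:1-16   | Last sent CN PFS2:28 | FID1-123:PFS1:1-16
             | FID1-123:PFS2:1-28   |                      | FID1-123:PFS2:1-28
             | FID1-123:PFS3:1-30   |                      | FID1-123:PFS3:1-30
-------------+----------------------+----------------------+-----------------------------
PFS3         | FID1-123:PFS1:1-11   | FID1-123:PFS1:1-11   | Last sent CN PFS3:31
             | FID1-123:PFS2:1-20   | FID1-123:PFS2:1-20   | Backfill Request PFS1:12-16
             | FID1-123:PFS3:1-31   | FID1-123:PFS3:1-31   | Backfill Request PFS2:21-28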

Notice that PFS3 is missing updates 12 through 16 for PFS1, and 21 through 28 for PFS2. PFS3 will request the missing content from any server that it believes has that content, which in this case would mean either PFS1 or PFS2. How does PFS3 know that both servers have the content? Because the replication message from PFS1, which included change number 17, included the information about the CN sets for PFS1, PFS2, and PFS3.
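Working out which change numbers to backfill is essentially a gap search. The sketch below shows one way to compute the missing ranges from the numbers in this example; the function name and the idea of comparing against a single advertised maximum are my own simplifications, not the store's actual algorithm.

```python
def missing_ranges(have, advertised_max):
    """Return inclusive (start, end) ranges of CNs in 1..advertised_max that are not in `have`."""
    ranges = []
    start = None
    for cn in range(1, advertised_max + 1):
        if cn not in have:
            if start is None:
                start = cn  # a new gap begins here
        elif start is not None:
            ranges.append((start, cn - 1))  # the gap ended just before this CN
            start = None
    if start is not None:
        ranges.append((start, advertised_max))
    return ranges


# PFS3 holds PFS1's changes 1-11 plus the new change 17; PFS1 advertises 1-17.
from_pfs1 = missing_ranges(set(range(1, 12)) | {17}, 17)
# PFS3 holds PFS2's changes 1-20; the status in the message advertises 1-28.
from_pfs2 = missing_ranges(set(range(1, 21)), 28)
print(from_pfs1, from_pfs2)
```

The two results correspond to the Backfill Request PFS1:12-16 and Backfill Request PFS2:21-28 entries in the table.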

Strictly speaking, Exchange doesn’t issue these backfill requests right away – it waits a few hours (six or more, depending on the situation) before sending them out, just in case one of its replication partners happens to send that missing content. If a specific update hasn’t been received after the backfill timeout is reached, Exchange then generates that backfill request and sends it to the replication partners. This process is detailed in the “Backfill Requests and Backfill Messages” section of the TechNet page on “Understanding Public Folder Replication” at http://technet.microsoft.com/en-us/library/bb629523.aspx#Backfill.
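The waiting period can be pictured as a queue of noticed gaps, each of which only turns into a request if it is still open when the timeout expires. In this sketch the BackfillQueue class is hypothetical, and the fixed six-hour value is an assumption taken from the low end of the "six or more" figure above; the real timeout varies by situation.

```python
from datetime import datetime, timedelta

BACKFILL_TIMEOUT = timedelta(hours=6)  # assumed value; the real timeout depends on the situation


class BackfillQueue:
    """Tracks missing changes and reports the ones that have waited past the timeout."""

    def __init__(self):
        self.pending = {}  # (origin, cn) -> time the gap was first noticed

    def note_gap(self, origin, cn, now):
        self.pending.setdefault((origin, cn), now)

    def content_arrived(self, origin, cn):
        # A partner sent the missing change on its own, so no request is needed.
        self.pending.pop((origin, cn), None)

    def due_requests(self, now):
        return sorted(k for k, seen in self.pending.items() if now - seen >= BACKFILL_TIMEOUT)


t0 = datetime(2012, 2, 6, 10, 0)
queue = BackfillQueue()
queue.note_gap("PFS1", 12, t0)
queue.note_gap("PFS1", 13, t0)
queue.content_arrived("PFS1", 12)  # change 12 shows up an hour later by itself

early = queue.due_requests(t0 + timedelta(hours=5))
late = queue.due_requests(t0 + timedelta(hours=6))
print(early, late)
```

Nothing is requested at the five-hour mark, and only the change that never arrived is requested once the timeout is reached.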

Removing or Deleting Replicas

When you remove a public folder replica, the owning public folder database contacts all other databases to find out if they have all of the content that's contained within the replica that's about to be removed.  It does so by sending out a status message that contains the CNs for its replica of the folder. For example, if I were to remove the replica of Folder1 from PFS3, it would send a message to PFS1 and PFS2 confirming that between the two of them, they have every update from PFS3 from 1 to 31. [This is an important point: the content doesn’t need to be on one server. As long as the content exists somewhere in the organization, the replica can be removed.] If PFS3 had any unique content that neither PFS1 nor PFS2 had, it would replicate those items to its replication partners. Once it has confirmed that it no longer has any unique content, the public folder store removes that replica.
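The "between the two of them" check amounts to comparing the departing replica's CNs against the union of what the remaining replicas hold. Here is a minimal sketch under that assumption; the function name and the dictionary layout are made up for illustration.

```python
def unique_content(removing, others):
    """Per origin, the CNs held by the replica being removed and by none of the others."""
    unique = {}
    for origin, cns in removing.items():
        covered = set()
        for other in others:
            covered |= other.get(origin, set())  # union across all remaining replicas
        missing = cns - covered
        if missing:
            unique[origin] = missing
    return unique


pfs3 = {"PFS3": set(range(1, 32))}   # PFS3 holds its own changes 1-31
pfs1 = {"PFS3": set(range(1, 21))}   # PFS1 has 1-20
pfs2 = {"PFS3": set(range(15, 31))}  # PFS2 has 15-30

before = unique_content(pfs3, [pfs1, pfs2])
pfs1["PFS3"].add(31)                 # PFS3 replicates its unique change out first
after = unique_content(pfs3, [pfs1, pfs2])
print(before, after)
```

Neither partner holds the full range on its own, but together they cover everything except change 31; once that change is replicated out, nothing unique remains and the replica can be removed.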

However, when you delete a public folder outright (as in, remove all replicas), there's no need to preserve content, so it's deleted from every public folder store.  This is why it’s vital that public folder administrators understand the difference between removing a replica (with Set-PublicFolder -Replicas) and deleting a public folder (with Remove-PublicFolder).

These changes to replica lists and outright deletions are transmitted just like any other public folder change – as hierarchy replication messages, complete with their own CNs.  If I remove the replica of Folder1 from PFS1, that change will go to PFS2 and PFS3 so that they know that they no longer need to replicate new content for Folder1 to PFS1.  Likewise, if I delete Folder1, it will be deleted from all of the databases and removed from the hierarchy as well.  The replication state table keeps track of changes to hierarchy too, and so knows which folders exist in the organization and which don't. It is this tracking mechanism that prevents us from simply restoring a public folder database and reintroducing the deleted folders into the environment.

Recovery of Deleted Public Folders

In part one of this blog, I outlined a process for safely and successfully restoring public folders which were accidentally deleted from the environment. Step six of the procedure reads, in part, “Copy each of the folders you wish to restore. [Although the new folders will have similar names to the originals, the underlying folder IDs (FIDs) are different.]” I’ve added italics to highlight the key point – when you copy (clone) public folders, you’re really creating new folders. They may bear the same name as the originals, but the folder IDs are different. So although my cloned copy of Folder1 may look like the original Folder1, and contain the same items as Folder1, none of the replication messages for the original Folder1 will apply to it, because it’ll have a completely different FID. This new folder is added to the hierarchy, and because end users see the name, not the FID, they’ll simply use it as they would the original folder.
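A tiny model shows why the clone survives when the restored original does not. This sketch assumes the incoming hierarchy update can be reduced to a set of FIDs recorded as deleted, which is my simplification of what the store does internally; the FID for the cloned folder is invented.

```python
def apply_hierarchy_update(local_folders, deleted_fids):
    """Drop folders whose FID the hierarchy has recorded as deleted; keep everything else."""
    return {fid: name for fid, name in local_folders.items() if fid not in deleted_fids}


restored_database = {
    "1-38BF": "Folder1",  # original folder, brought back by the restore
    "1-4A10": "Folder1",  # clone made in Outlook; hypothetical new FID
}
deleted_in_production = {"1-38BF"}

after_replication = apply_hierarchy_update(restored_database, deleted_in_production)
print(after_replication)
```

Both folders carry the same display name, but only the one with the new FID is left after the update, and that is the folder end users will see and use.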

Troubleshooting Replication

If you’re looking for troubleshooting information, look no further than Bill Long’s excellent four-part blog series on public folders.

Summary

Public folders use their own replication mechanism, where changes are tracked in an internal, non-editable table and communicated to replication partners alongside the actual content changes. The public folder hierarchy follows the same principles, and so changes made to the hierarchy are replicated to all public folder databases in the environment. Understanding the replication mechanism helps an administrator understand not only disaster recovery, but troubleshooting as well.

John Rodriguez
Principal Premier Field Engineer
Microsoft Premier Support

Categories: Exchange feeds

Exchange ActiveSync client connectivity in Office 365

February 7, 2012 - 7:50pm

This article explains how mobile devices connect to the Exchange Online (Office 365) service and how connectivity may be impacted if a device does not support certain Exchange ActiveSync (EAS) protocol requirements.

Exchange ActiveSync protocol versions

Most mobile devices that connect to Exchange do so using the Exchange ActiveSync protocol. Each successive version of the protocol offers new capabilities. (The Exchange ActiveSync article maintained by the Exchange community on Wikipedia has more details. -Editor)

Before any device accesses an Exchange mailbox, it negotiates with the Exchange server to determine the highest protocol version that they both support, and then uses this protocol version to communicate. Through the protocol version negotiation, the device and the server agree to behave in a particular manner in accordance with the version selected.
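The outcome of the negotiation is simply the highest version that appears on both sides. The sketch below illustrates that selection only; the function name and the version lists are example values, not a statement of what any particular device or server advertises.

```python
def negotiate(device_versions, server_versions):
    """Return the highest protocol version both sides support, or None if there is none."""
    common = set(device_versions) & set(server_versions)
    if not common:
        return None
    # Compare numerically: as plain strings, "2.5" would sort above "12.1".
    return max(common, key=lambda v: tuple(int(part) for part in v.split(".")))


device = ["2.5", "12.0", "12.1"]
server = ["2.5", "12.0", "12.1", "14.0", "14.1"]
agreed = negotiate(device, server)
print(agreed)
```

Parsing each version into a tuple of integers is the one design choice worth noting, since a plain string comparison would pick the wrong answer.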

Mailbox redundancy in Office 365

In Office 365, we store multiple copies of user mailboxes, geographically distributed across different sites and datacenters. This redundancy ensures that if one copy of the mailbox fails for some reason (for example due to a hardware failure on a particular server), we can access the same mailbox elsewhere. At any given time, one copy of a particular mailbox is considered active and the remaining ones are deemed passive. When a user connects to their mailbox, they take actions on the active copy, and changes are then propagated to its passive copies.

Mailbox database failover

The switch from one active copy of a mailbox to another one stored on a different mailbox server may happen for the following reasons:

  • Failover: If hardware or connectivity failures arise in a site, Exchange 2010 in Office 365 automatically switches (or fails over) to a different mailbox database to ensure continuous access to your mailboxes.
  • Load balancing: If some servers are experiencing higher loads, mailboxes may need to be load-balanced across different servers.
  • Testing or maintenance: Mailbox databases may be switched when we are testing our disaster recovery procedures, or when servers are upgraded.

In most cases, failover and load balancing are not scheduled in advance. The process is executed automatically when the need arises, without manual intervention.

Exchange ActiveSync connection process

In Office 365, EAS devices connect to a public-facing Exchange Client Access Server (CAS). CAS authenticates the user based upon the provided credentials and retrieves the user’s mailbox version and the mailbox’s location. The mailbox’s location is the Active Directory forest and site where the active copy of the user mailbox is stored.

The CAS will handle the connection in one of the following ways, depending on the mailbox location relative to the location of the CAS:

  • Same forest, same site: If the mailbox is in the same Active Directory site as the CAS, CAS will retrieve the content directly from the Mailbox server.
  • Same forest, different site: If the mailbox is in the same Active Directory forest but a different Active Directory site than the CAS, CAS will redirect or proxy the device to the correct Active Directory site in that forest.
  • Different forest, different site: If the mailbox is located in a different Active Directory forest than the CAS, CAS will act differently depending on the EAS protocol version that it previously negotiated with the device:
    • If the device is using earlier versions of the protocol (EAS 12.0 and below), the connection is proxied to a CAS server in the forest where the mailbox is located.
    • If the device is using more recent versions of the protocol (EAS 12.1 and above), CAS issues a redirection request back to the device pointing it to the specific forest containing the mailbox. The device should then establish a direct connection to the new forest.
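The decision described in the list above can be written as a small function. This is a sketch of the logic as stated here, not CAS code: the Action names, the route function, and the forest and site labels are all invented for the example.

```python
from enum import Enum


class Action(Enum):
    DIRECT = "retrieve content directly from the Mailbox server"
    PROXY_OR_REDIRECT_TO_SITE = "proxy or redirect to the correct site in the same forest"
    PROXY_TO_FOREST = "proxy to a CAS in the mailbox's forest"
    REDIRECT_TO_FOREST = "redirect the device to the mailbox's forest"


def route(cas_forest, cas_site, mailbox_forest, mailbox_site, eas_version):
    """Pick how a CAS handles a request, given the mailbox location and negotiated version."""
    if cas_forest == mailbox_forest:
        if cas_site == mailbox_site:
            return Action.DIRECT
        return Action.PROXY_OR_REDIRECT_TO_SITE
    # Different forest: the negotiated EAS version decides between proxy and redirect.
    version = tuple(int(part) for part in eas_version.split("."))
    if version >= (12, 1):
        return Action.REDIRECT_TO_FOREST
    return Action.PROXY_TO_FOREST


same_site = route("forestA", "site1", "forestA", "site1", "14.1")
other_site = route("forestA", "site1", "forestA", "site2", "14.1")
old_device = route("forestA", "site1", "forestB", "site9", "12.0")
new_device = route("forestA", "site1", "forestB", "site9", "12.1")
print(same_site.name, other_site.name, old_device.name, new_device.name)
```

Note that the protocol version only matters in the cross-forest case, which is exactly where the redirection problems described later come from.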

For an overview of proxying and redirection, see Understanding Proxying and Redirection in Exchange 2010 documentation.

How do devices choose which site to access?

Phones and tablets connect to Office 365 in a number of ways, depending on the device capabilities, configuration and which protocol version has been negotiated. Specifically:

  • The device may automatically discover the correct mailbox forest based on the user’s email address if the device supports the EAS Autodiscover command.
  • The user may configure the device to access a specific URL:
    • If the user enters the Office 365 endpoint URL for mobile devices (m.outlook.com), this address points the device to a number of forests that are geographically closest to the user. The device then connects to one of the returned forests.
    • If the user enters a specific forest URL, the device connects to that forest.
    • If the user enters a specific site URL, the device connects directly to that site.

Office 365 contains a number of Active Directory forests, each of which contains several sites. Each forest has a default front-end site. When a device connects to a forest, it transparently connects to the front-end site for that forest.

Depending on whether the device connects to the Active Directory site where the user’s mailbox is located, the connection logic either retrieves the content directly, or proxies or redirects the device to the correct site.

Issues with redirection

More recent versions of the EAS protocol support the redirection command. When a device using a more recent version of the protocol reaches a CAS in a site that doesn't contain the requested mailbox, the server responds to the request by redirecting the device to a CAS in the site hosting the active copy of the user’s mailbox. We assume that devices that advertise support for EAS protocol version 12.1 or later comply with the EAS requirement to support the HTTP redirection error code.
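On the device side, honoring the redirect comes down to noticing the response and switching to the URL it supplies. The sketch below assumes the redirect arrives as HTTP status 451 with the new URL in an X-MS-Location header, which is how I understand the EAS HTTP protocol to signal it; the paragraph above does not name the code, so treat that as an assumption along with the function name and the example URLs.

```python
REDIRECT_STATUS = 451                 # assumed status code for an EAS redirect
REDIRECT_HEADER = "x-ms-location"     # assumed header carrying the new URL


def next_endpoint(status, headers, current_url):
    """Return the URL the device should use for its next request."""
    if status == REDIRECT_STATUS:
        # Header names are case-insensitive, so normalize before looking up.
        normalized = {name.lower(): value for name, value in headers.items()}
        location = normalized.get(REDIRECT_HEADER)
        if location:
            return location  # a compliant device reconnects here
    return current_url       # anything else: keep using the current endpoint


current = "https://m.outlook.com/Microsoft-Server-ActiveSync"
target = "https://pod00000.example.com/Microsoft-Server-ActiveSync"  # hypothetical site URL

redirected = next_endpoint(451, {"X-MS-Location": target}, current)
unchanged = next_endpoint(200, {}, current)
print(redirected, unchanged)
```

A device that advertises 12.1 or later but ignores this response keeps retrying the original URL, which is the failure described next.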

Note: If you want to determine the Exchange ActiveSync protocol version that your device is currently using, refer to your device manufacturer’s documentation.

A problem can occur when a device claims to support redirection, but does not reliably do so. These devices cannot access the mailbox, and the user may receive a number of errors depending on the device (for example, unable to connect to server). A very small number of devices connecting to Office 365 (about 1%) are impacted by this failure to implement Exchange ActiveSync completely.

Modifying the Office 365 deployment to compensate for these devices that don’t correctly support redirection would result in a degraded experience for all mobile device users. Performance for the devices is better if they connect to the correct Active Directory site directly after being redirected.

Phones and tablets that are part of the Exchange ActiveSync Logo Program support redirection and thus, do not experience this issue. We are working with a number of other manufacturers to help them support the redirection logic and fix their connectivity issues.

How to fix it?

If your users are having trouble connecting to their Office 365 mailboxes on devices that don’t fully support redirection, use one of the following methods to fix the issue:

  1. Update the Exchange server setting on your device to m.outlook.com. Then, try connecting to your account and see if this change resolves the issue.
  2. If using the Exchange server name m.outlook.com does not fix the issue:
    1. Sign in to your account using Outlook Web App on a computer.
    2. Click Options in the top right corner and select See All Options…
    3. On the My Account tab, click Settings for POP, IMAP and SMTP Access…
    4. On the page that opens, under External POP setting you'll see a server name listed.

      Use the Server name on this page for the Exchange server value on your device email configuration.

      Note: Although the setting is listed as the server name for POP, it's also an endpoint for Exchange ActiveSync.

  3. If using m.outlook.com and the External POP Settings/Server name value did not fix the issue:
    1. Go back to the main page of Outlook Web App. In the top right corner, click on the question mark next to Options and then select About.
    2. On the About page, you'll see the entry for the Host name listed.

      Use the value next to the Host name as the server setting on your mobile device.

    Note: When you use the Host name as your Exchange server setting, you may need to update the setting in the future. As I described before, the mailboxes may be moved from one site to another, and devices that do not support the redirect command correctly will lose connectivity. If your user mailbox moves due to failover or upgrades, your site name (Host name) may change and you may need to reconfigure your device to point to the new site.

  4. Another method to resolve the issue may be to try using a different email application on your mobile device. Some EAS applications are able to properly handle redirection even on a device that doesn’t support the redirection command.
More help and resources

Katarzyna Puchala

The title of this post was changed shortly after publishing. The permalink URL may differ from the post title.

Recovering Public Folders After Accidental Deletion (Part 1: Recovery Process)

February 6, 2012 - 10:00am
Overview

This two-part blog series will outline some of the recovery options available to administrators in the event that one or more public folders are accidentally deleted from the environment. The first part will explain the options, while the second part will outline the architectural aspects of public folders that drive the options.

Introduction

In older versions of Exchange, mailbox and mailbox database recovery was a long, complicated process involving backups, recovery servers, and changes to Active Directory. Successive versions of the product have introduced more and more functionality around recovery (recovery storage groups/databases, database replication, etc.), and we're now at the point where restoring a mailbox is a seemingly trivial operation, and restoring a mailbox database is almost unheard of. But mailboxes aren't the only data stored on Mailbox servers in Exchange Server 2010, and the procedure for restoring public folders and public folder databases differs greatly from the mailbox procedure.

Review of Recovery Options

The first two recovery options are detailed either in TechNet or elsewhere on the Exchange team blog site, so I'll simply list them here and then move on to the real purpose of this blog.  The recovery options for public folders and public folder databases in Exchange Server 2010 are as follows, from the easiest to the most complex:

  1. Recover deleted folders via Outlook (detailed in http://technet.microsoft.com/en-us/magazine/dd553036.aspx).

    Note: Exchange Server 2010 Service Pack 2 fixes an issue where users were unable to use Outlook to recover deleted public folders. This is another reason to upgrade your Exchange Server 2010 systems to SP2 at the earliest opportunity.

  2. Recover deleted folders via ExFolders (http://blogs.technet.com/b/exchange/archive/2009/12/04/3408943.aspx).
  3. Recover folders via public folder database restore.

The first option is the easiest and most obvious - if an end user accidentally deletes a folder, he or she should be able to undelete that folder using Outlook. Failing that, an administrator should be able to use ExFolders to recover that folder. But what if these options won't work in your situation? What if the end user didn't realize he or she deleted the folder, and a month has passed? Or what if your organization has changed the retention settings for deleted public folders, and essentially eliminated the dumpster?  How do you recover public folders in this case?

Recovery Options

At the heart of public folder recovery is a painful truth: you can't delete a public folder from the organization and recover it by simply restoring an older version of a public folder database. If you restore a public folder database from backup and place it back into production, you’ll see the public folders only until the server receives replication messages. Because the public folder hierarchy – the list of all folders in the environment – no longer includes the folders which were deleted, the target server has copies of folders which, from Exchange’s perspective, don’t exist. As soon as that public folder database receives a hierarchy update, it will see that those public folders aren’t present in the hierarchy, and the store will delete the public folder again. Since you can’t edit the hierarchy via the Public Folder Management Console (or even via adsiedit.msc), you can't manually add that public folder back in. So, given this limitation, how do we recover that public folder?

Consider the following points:

  • If you don't replicate every folder to every database, you would need to delete all current databases and then recover from backup any database that contains unique content.  This only works if you have recent backups, of course, and would also require that you export any content generated since that backup, since you’re going to delete all of the existing databases. The deletion is necessary because if a restored public folder store receives hierarchy replication from one of the existing public folder stores, the whole exercise is for naught.
  • If you do replicate all folders to all stores in the environment, you can delete all stores and just restore one database, then replicate the content from that database out to the other servers. Again, this depends on all databases having duplicate content, and you must delete all existing databases before restoring the one from backup.
  • You can restore a backup of the public folder database to an isolated Exchange environment, connect to the public folder database with Outlook, export all content to a series of PSTs, create new folders in the production environment with the same names as the deleted folders, and then import all of the content. This is obviously a somewhat manual process, and most administrators aren't going to want to do this.

Recommended Recovery Procedure

Thankfully there is a much easier process which can be performed in-place and with a minimum of fuss.

  1. Select one of the existing public folder servers in the environment. [Using an existing server simplifies the process a bit.] You will isolate this system from its replication partners, so choose a system that doesn’t serve as the source for a lot of content which needs to be replicated.
  2. Using Registry Editor, set the value of the Replication registry key (HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeIS\<servername>\Public-<GUID of Public Store>) to 0 (zero).

    Note: You may need to create this DWORD key if it doesn’t already exist. Further information on the Replication registry key is available in the article, “Replication does not occur for one Exchange server in the organization” (http://support.microsoft.com/kb/812294). This registry key also applies to Exchange Server 2007 and 2010.

  3. Restore the public folder database in place using your normal restoration procedure.
  4. Using an Outlook client, log onto a mailbox which uses the restored public folder database as its default public folder store (this is necessary in order to see the restored folders). If you don’t have a mailbox database which uses that public folder database as its default, either create a new mailbox database (recommended) or change an existing mailbox database to use the newly-restored public folder database.
  5. If necessary, click the Folders icon at bottom left of the Navigation screen, and then expand the public folders node.
  6. Copy each of the folders you wish to restore to another location within the public folder hierarchy. If you’re restoring an entire hierarchy, you can simply Ctrl-click and drag the root folder to make new copies of all subfolders. Although the new folders will have similar names to the originals, the underlying folder IDs (FIDs) are different.
  7. Once you’ve created copies of all of the folders, verify that the replica lists include all desired targets (and reconfigure as appropriate).
  8. At this point, it’s now safe to reintroduce that server into the production environment. To do so, dismount the public folder database, delete the Replication registry key (or set it to 1), and then remount the database.
  9. As soon as hierarchy is replicated to the server, the original folders will once again disappear, but the copies of the folders will be replicated to all replication partners.

You may need to add mail-enabled public folders back into distribution groups, as their SMTP addresses will likely be different from those on the original folders. End users will also need to recreate public folder favorites in Outlook.

Summary

Recovering from accidental public folder deletion can be difficult, especially if you don’t take hierarchy replication into account. By restoring into an isolated environment, and then cloning the folders to be restored, you can work around this limitation and restore the missing content. In the next blog entry, I’ll explain the underlying architecture of public folders (including replication, change numbers, and the replication state table) to show why these steps are so necessary.

John Rodriguez
Principal Premier Field Engineer
Microsoft Premier Support

Released: Migrating From Exchange Server 2010 in Hosting Mode to Exchange Server 2010 SP2 whitepaper

February 3, 2012 - 1:00pm

I’m very happy to announce that we have just made available for download a guide to help those of you intending to migrate from Exchange in /Hosting mode to Exchange 2010 SP2 installed without use of the /hosting switch.

Like the previous HMC to Exchange 2010 SP2 guidance, it contains a white paper and some PowerShell scripts. The white paper describes the migration process, and the scripts provide a starting point for your own migration toolkit. Of course the exact migration steps and methodology you will need to follow will depend upon what you have deployed, but we hope what we have provided will help you with your efforts and provide you some useful tools and information.

Check out the Migrating From Exchange Server 2010 in Hosting Mode to Exchange Server 2010 SP2 documentation.

We know any cross-forest migration can be tough, and there are also companies out there that provide migration tools and consulting, so if you feel you need more help than the guidance provides, or if you need some form of longer term co-existence, you may want to look at those offerings.

Finally, as discussed several times on this blog, building a multi-tenancy solution is a complex undertaking. We still very much recommend that you look at existing solutions available in the market today and/or look at engaging solution integration partners to help with your solution. There are several solutions listed on our web site, and more coming, so before trying to re-invent the wheel to build your multi-tenant offering, look at what the market can offer.

Good luck with your migration!

Greg Taylor
Principal Program Manager (though not as awesome as Ross) 
Exchange Customer Experience

Categories: Exchange feeds

A script to troubleshoot issues with Exchange ActiveSync

January 31, 2012 - 3:54pm

The Exchange support team fairly often receives cases where mobile devices using the Exchange ActiveSync (EAS) protocol send so many requests to an Exchange server that the server runs out of resources, which is in effect a ‘denial of service’ (DoS) condition. In the worst case, the server also becomes unavailable to users who do not connect through EAS. We have documented this issue, with possible mitigations, in the following Knowledge Base article:

2469722 Unable to connect using Exchange ActiveSync due to Exchange resource consumption

A recent example of this issue was Apple iOS 4.0 devices retrying a full sync every 30 seconds (see TS3398). Another example is devices that do not know how to handle a ‘mailbox full’ response from the Exchange server and keep trying to reconnect. Such devices can attempt to connect and sync with the mailbox more than 60 times a minute, killing battery life on the device and causing performance issues on the server.

Managing mobile devices & balancing available server resources among different types of clients can be a daunting challenge for IT administrators. Trying to track down which devices are causing resource depletion issues on Exchange 2010/2007 Client Access server (CAS) or Exchange 2003 Front-end (FE) server is not an easy task. As referenced in the article above, you can use Log Parser to extract useful statistics from IIS logs (see note below), but most administrators do not have the time & expertise to draft queries to extract such information from lengthy logs.

The purpose of this post is to introduce the Exchange community to a new PowerShell script that can identify devices causing resource depletion issues, help spot performance trends, and automatically generate reports for continuous monitoring. With this script you can quickly drill into your users' EAS activity, which is otherwise a major task when IIS logs reach several gigabytes in size. The script also makes it easier to identify users with multiple EAS devices. You can use it to establish a baseline during periods of normal EAS activity, then compare and report against that baseline when activity deviates. It also provides an auto-monitoring feature that sends e-mail notifications.

Note: The script works with IIS logs on Exchange 2010, Exchange 2007 and Exchange 2003 servers.
All communication between mobile devices using EAS protocol and Microsoft Exchange is logged in IIS Logs on CAS/FE servers in W3C format. The default W3C fields enabled for logging do vary between IIS 6.0 and 7.0/7.5 (IIS 7.0 has the same fields as 7.5). This script works against both versions.
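
Because the enabled W3C fields differ between IIS versions and servers, any tool reading these logs has to take its column names from the `#Fields:` header rather than from fixed positions. The following is a minimal Python sketch of that idea, not the script's own code; the sample log lines, user names and device IDs are invented for illustration.

```python
# Sketch: map each IIS W3C log entry to a dict keyed by the "#Fields:" header.
# The sample lines below are invented for illustration.

def parse_w3c(lines):
    """Yield one dict per log entry, keyed by the most recent #Fields: header."""
    fields = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # The header can reappear inside a file, so always use the latest one.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split(" ")
            if len(values) == len(fields):
                yield dict(zip(fields, values))

sample = [
    "#Software: Microsoft Internet Information Services 7.5",
    "#Fields: date time cs-method cs-uri-stem cs-uri-query cs-username sc-status time-taken",
    "2011-12-24 10:00:01 POST /Microsoft-Server-ActiveSync "
    "Cmd=Ping&User=alice&DeviceId=ABC123&DeviceType=PhoneX CONTOSO\\alice 200 1500",
    "2011-12-24 10:00:02 POST /Microsoft-Server-ActiveSync "
    "Cmd=Sync&User=bob&DeviceId=XYZ789&DeviceType=PhoneY CONTOSO\\bob 200 120",
]

entries = list(parse_w3c(sample))
print(len(entries), entries[0]["sc-status"], entries[1]["time-taken"])
```

Optional fields such as time-taken simply show up as extra dictionary keys when they are enabled, so later processing can check for them before use.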

IIS Logs

Because EAS uses HTTP, all EAS requests are recorded in the IIS logs. IIS logging is enabled by default, but administrators sometimes disable it to save space on servers. Check whether logging is enabled and find the location of the log files by following these steps:

IIS 7

  1. In IIS Manager, expand the server name, for example ExchangeServer (Contoso\Administrator).
  2. In the Features View, double click Logging in the IIS section.

IIS 6

  1. In IIS Manager, right click the web site name (for most it should be Default Web Site) and choose Properties
  2. Click on the Web Site tab.

What are mobile devices responsible for in communications with the server?

Before we delve into the specifics of the script, let's review some important requirements for mobile devices that use EAS to communicate with Microsoft Exchange.

  • When a mobile device is returned an unexpected response from server, it's up to the device to handle the response and retry appropriately at a reasonable interval. Additionally, devices are responsible for handling timeouts that happen outside of IIS, which may be caused by network latency.
  • With each request a device sends to IIS/Exchange, it should also report the User-Agent.
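
To make "retry appropriately at a reasonable interval" concrete, here is a small Python sketch of a capped exponential backoff schedule. This is a common general-purpose technique and an assumption on our part; it is not taken from the EAS protocol specification, and the base, factor and cap values are purely illustrative.

```python
# Sketch: capped exponential backoff (illustrative values, not from the EAS spec).

def backoff_delays(base_seconds=30, factor=2, cap_seconds=900, attempts=6):
    """Return the wait time before each retry, doubling up to a fixed cap."""
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        delays.append(min(delay, cap_seconds))
        delay *= factor
    return delays

print(backoff_delays())
```

A device that waits progressively longer after repeated failures sends far fewer requests than one retrying at a fixed short interval.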

What will you see when you use this script?

The script uses Microsoft Log Parser 2.2 to parse IIS logs and generate results. It builds different SQL queries for Log Parser based on the switches you use (see the table below). A previous blog post, Exchange 2003 - Active Sync reporting, discusses Log Parser and touches on similar points; the information in that post still applies to Exchange 2010 and 2007. Since that post, additional commands have been added to the EAS protocol, and this new script processes those as well.

Here's a list of the EAS commands that the script will report in results:

Sync, SendMail, SmartForward, SmartReply, GetAttachment, GetHierarchy, CreateCollection, DeleteCollection, MoveCollection, FolderSync, FolderCreate, FolderDelete, FolderUpdate, MoveItems, GetItemEstimate, MeetingResponse, Search, Settings, Ping, ItemOperations, Provision, ResolveRecipients, ValidateCert

For more details about each EAS command, see ActiveSync HTTP Protocol Specification on MSDN.
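
As an illustration of where these commands come from in the log, the Python sketch below pulls Cmd, User, DeviceId and DeviceType out of the cs-uri-query value and counts commands per device. It assumes the plain-text query form shown in the sample strings (which are invented); requests that encode the query differently would need extra handling, and this is not the script's own implementation.

```python
# Sketch: count EAS commands per device from cs-uri-query values.
# Assumes plain-text query parameters; the sample strings are invented.
from collections import Counter
from urllib.parse import parse_qs

def eas_params(query):
    """Return Cmd, User, DeviceId and DeviceType from an EAS query string."""
    parsed = parse_qs(query)
    keys = ("Cmd", "User", "DeviceId", "DeviceType")
    return {key: parsed.get(key, [""])[0] for key in keys}

queries = [
    "Cmd=Sync&User=alice&DeviceId=ABC123&DeviceType=PhoneX",
    "Cmd=Ping&User=alice&DeviceId=ABC123&DeviceType=PhoneX",
    "Cmd=Sync&User=alice&DeviceId=ABC123&DeviceType=PhoneX",
    "Cmd=FolderSync&User=bob&DeviceId=XYZ789&DeviceType=PhoneY",
]

per_device = Counter()
for query in queries:
    params = eas_params(query)
    per_device[(params["DeviceId"], params["Cmd"])] += 1

print(per_device[("ABC123", "Sync")])
```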

In addition to these commands, the following parameters are also logged by the script.

  1. User
  2. User Name
  3. Device Type
  4. Device ID
  5. User-Agent
  6. sc-bytes: This is only available if you have enabled this field in IIS logging.
  7. cs-bytes: This is only available if you have enabled this field in IIS logging.
  8. time-taken (in milliseconds): This is only available if you have enabled this field in IIS logging.
  9. Total number of requests or requests by Device ID
  10. Total number of all 4xx status codes
  11. Total number of all 5xx status codes (for more info, see KB: 318380 for IIS 6.0 & KB: 943891)
  12. 409 status codes: 409 (Conflict) - A collection cannot be made at the Request-URI until one or more intermediate collections have been created. The server MUST NOT create those intermediate collections automatically (Ref: RFC 4918)
  13. 500 status codes: After device sends OPTIONS command, it’s possible to get a 500 response back from server with ‘MissingCscCacheEntry’ error. This can happen as a result of an issue with the affinity where you have an Internet-facing CAS array proxying a request to an Internal CAS array. When the Internet-facing array sends the request to the Internal array, a CAS server will answer with the first 401. In the next communication, the request is handled by a different CAS server in the Internal array. Resolving the affinity issue with the Internal CAS array is the solution.
  14. 503 status codes: The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.

    Note: The existence of the 503 status code does not imply that a server must use it when becoming overloaded. Some servers may wish to simply refuse the connection. (Ref: RFC 2616)

  15. 507 status codes: The 507 (Insufficient Storage) status code means the method could not be performed on the resource because the server is unable to store the representation needed to successfully complete the request. This condition is considered to be temporary. If the request that received this status code was the result of a user action, the request MUST NOT be repeated until it is requested by a separate user action. (Ref: RFC 4918)
  16. 451 status codes: Exchange 2007/2010 returns an HTTP 451 response to an EAS client when it determines that the device should be using a ‘better’ CAS for EAS connectivity. The logic used to determine ‘better’ CAS is based on Active Directory sites and whether a CAS is considered ‘Internet-facing’. If the ExternalUrl property on the Microsoft-Server-ActiveSync virtual directory is specified, then that CAS is considered to be Internet-Facing for EAS connectivity. (Ref: TechNet articles Exchange ActiveSync Returned an HTTP 451 Error and Understanding Proxying and Redirection)
  17. TooManyJobsQueued errors: For more info on ‘TooManyJobsQueued’ please refer to KB: 2469722 referenced above
  18. OverBudget: A budget is the amount of access that a user or application may have for a specific setting. A budget represents how many connections a user may have or how much activity a user may be permitted for each one-minute period. (Ref: TechNet article)
  19. Following subset of Common Status Codes: InvalidContent, ServerError, ServerErrorRetryLater, MailboxQuotaExceeded, DeviceIsBlockedForThisUser, AccessDenied, SyncStateNotFound, DeviceNotFullyProvisionable, DeviceNotProvisioned, ItemNotFound, UserDisabledForSync
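
The status-code columns above amount to per-device counters. The Python sketch below shows one way such a tally could be built from (device ID, sc-status) pairs; the input data is invented, and the set of individually tracked codes simply mirrors the codes named in the list above.

```python
# Sketch: tally 4xx/5xx totals and selected status codes per device.
# Input pairs are invented; WATCHED mirrors the codes named in the list above.
from collections import defaultdict

WATCHED = {"409", "451", "500", "503", "507"}

def summarize_status(entries):
    """entries: iterable of (device_id, sc_status) string pairs."""
    summary = defaultdict(lambda: defaultdict(int))
    for device_id, status in entries:
        row = summary[device_id]
        row["hits"] += 1
        if status.startswith("4"):
            row["4xx"] += 1
        elif status.startswith("5"):
            row["5xx"] += 1
        if status in WATCHED:
            row[status] += 1
    return summary

entries = [
    ("ABC123", "200"), ("ABC123", "507"), ("ABC123", "507"),
    ("XYZ789", "451"), ("XYZ789", "503"),
]
summary = summarize_status(entries)
print(dict(summary["ABC123"]))
```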

What can you do with this script?

You can process logs using this script to retrieve the following details:

  1. Hits by user/Device ID (users/devices with maximum number of requests sent to server)
  2. Hits per hour/day (helps in determining the frequency of requests sent by user/device, time value is entered in seconds)
  3. Hits by device with specified threshold limit (here you can specify a limit for hits/requests, i.e. all users who are sending 1000 requests per hour/day, etc.)
  4. CSV export of results
  5. HTML report of results
  6. E-mail reports for monitoring (CSV/HTML formats)
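
For readers who want to see the idea behind "hits by device" with a threshold or a top-N cut-off, here is a minimal Python sketch. The function and argument names are hypothetical, the device IDs are invented, and treating the threshold as "greater than or equal to" is an assumption rather than documented behavior of the script.

```python
# Sketch: rank devices by request count, with optional threshold and top-N.
# Names are hypothetical; ">=" for the threshold is an assumption.
from collections import Counter

def hits_report(device_ids, minimum_hits=0, top_hits=None):
    """Return (device_id, hits) pairs, busiest first."""
    counts = Counter(device_ids)
    return [(dev, n) for dev, n in counts.most_common(top_hits) if n >= minimum_hits]

requests = ["ABC123"] * 5 + ["XYZ789"] * 3 + ["QRS456"]
print(hits_report(requests, minimum_hits=3))
print(hits_report(requests, top_hits=1))
```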

Prerequisites:

Please make sure you have the following installed on your machine before using this script:

Script Parameters

Each entry lists the parameter, whether it is required, its type, and a description.

  • ActiveSyncOutputFolder (Required, System.String): CSV and HTML output directory.
  • ActiveSyncOutputPrefix (Optional, System.String): Prefixes a string to the output file name.
  • CreateZip (Optional, System.Management.Automation.SwitchParameter): Creates a ZIP file. Can only be used with SendHTMLReport.
  • CreateZipSize (Optional, System.Int32): Threshold file size. The default is 2MB. Once this has been exceeded, the file will be compressed. Requires SendHTMLReport and CreateZip to be true.
  • Date (Optional, System.String): Specify a date to parse on. Enter the date in the format MM-DD-YYYY.
  • DeviceId (Optional, System.String): ActiveSync Device ID to parse on.
  • DisableColumnDetect (Optional, System.Management.Automation.SwitchParameter): Disables the ability to add additional columns to the report that users may have enabled, for example time-taken. Note: If you are running against multiple files that may have different W3C headers, this switch should be used.
  • Help (Optional, System.Management.Automation.SwitchParameter): Outputs switch descriptions.
  • ReportBySeconds (Optional, System.Int32): Generates the report based on the value entered in seconds.
  • Hourly (Optional, System.Management.Automation.SwitchParameter): Generates the report on an hourly basis.
  • HTMLReport (Optional, System.Management.Automation.SwitchParameter): Creates an HTML report.
  • HTMLCSVHeaders (Optional, System.String): IIS CSV headers to export on in the -HTMLReport. Defaults: "DeviceID,Hits,Ping,Sync,FolderSync,DeviceType,User-Agent"
  • IISLogs (Required, System.Array): IIS log directory. Example: -IISLogs D:\Server,'D:\Server 2'
  • LogParserExec (Required, System.String): Path to LogParser.exe.
  • MinimumHits (Optional, System.Int32): Minimum hit threshold value where the report will generate on CSV and HTML.
  • SendEmailReport (Optional, System.Management.Automation.SwitchParameter): Enables e-mail reporting.
  • SMTPRecipient (Optional, System.String): SMTP recipient.
  • SMTPSender (Optional, System.String): SMTP sender.
  • SMTPServer (Optional, System.String): SMTP server.
  • TopHits (Optional, System.Int32): Top hits to return. Example: -TopHits 50. This cannot be used with Hourly or ReportBySeconds.

How do you use the script?

Below are some examples (with commands) on how you can use the script and why you might use them.

Hits greater than 1000

The following command will parse all the IIS Logs in the folder W3SVC1 and only report the hits by users & devices that are greater than 1000.

.\ActiveSyncReport.ps1 -IISLog "C:\inetpub\logs\LogFiles\W3SVC1" -LogparserExec "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe" -ActiveSyncOutputFolder c:\EASReports -MinimumHits 1000

[In the above command, the script ‘ActiveSyncReport.ps1’ is located at the root of the C drive, the -IISLog switch specifies the default location of IIS logs, the -LogparserExec switch points to the Log Parser executable, the -ActiveSyncOutputFolder switch sets the location where the output files are saved, and -MinimumHits with a value of ‘1000’ is the script parameter explained in the table above.]

Output:

Usually if a device is sending over 1000 requests per day, we consider this ‘high usage’. If the hits (requests) are above 1500, there could be an issue on the device or environment. In that case, the device & its user’s activity should be further investigated.
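
The rule of thumb above can be written down as a tiny Python helper. The two thresholds come straight from the paragraph; the function name and the labels it returns are hypothetical.

```python
# Sketch: apply the 1000/1500 requests-per-day rule of thumb from the text.
# Function name and labels are hypothetical.

HIGH_USAGE = 1000    # more than this per day is considered 'high usage'
INVESTIGATE = 1500   # more than this suggests a device or environment issue

def classify(hits_per_day):
    """Label a device's daily request count."""
    if hits_per_day > INVESTIGATE:
        return "investigate"
    if hits_per_day > HIGH_USAGE:
        return "high usage"
    return "normal"

for hits in (400, 1200, 25000):
    print(hits, classify(hits))
```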

As a real world example, in one case we noticed there were several users who were hitting their Exchange server via EAS a lot (~25K hits, 1K hits per hour) resulting in depletion of resources on the server. Upon further investigation we saw that all of those users’ requests were resulting in a 507 error on mailbox servers on the back-end. Talking to those EAS users we discovered that during that time period they were hitting their mailbox size limits (25 MB) & were trying to delete mail from different folders to get under the size limit. In such situations, you may also see HTTP 503 (‘TooManyJobsQueued’) responses in IIS logs for EAS requests as described in KB: 2469722

Isolating a specific device ID

The following command will parse all the IIS logs in the folder W3SVC1, look for the Device ID xxxxxx, and display its hourly statistics.

.\ActiveSyncReport.ps1 -IISLog "C:\inetpub\logs\LogFiles\W3SVC1" -LogparserExec "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe" -ActiveSyncOutputFolder c:\EASReports -DeviceID xxxxxx -Hourly

Output:

With the above information you can pick a user/device and see the hourly trends. This can help identify if it’s a user action or a programmatic one.
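
Conceptually, an hourly view is just the device's requests grouped by the hour portion of the date and time fields. The Python sketch below illustrates that grouping with invented entries; it is not the script's own query, and it keeps the timestamps exactly as they appear in the log.

```python
# Sketch: count one device's requests per hour from (date, time, device_id) rows.
# The rows are invented; timestamps are used as logged, without conversion.
from collections import Counter
from datetime import datetime

def hourly_hits(entries, device_id):
    """Return {"YYYY-MM-DD HH:00": hits} for the given device, in time order."""
    buckets = Counter()
    for date, time, dev in entries:
        if dev == device_id:
            stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
            buckets[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return dict(sorted(buckets.items()))

entries = [
    ("2011-12-24", "10:00:01", "ABC123"),
    ("2011-12-24", "10:30:00", "ABC123"),
    ("2011-12-24", "11:05:10", "ABC123"),
    ("2011-12-24", "10:15:00", "XYZ789"),
]
print(hourly_hits(entries, "ABC123"))
```

A steady count around the clock tends to point at programmatic activity, while bursts during working hours look more like user actions.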

As a real world example, in one case we had to find out which devices were modifying calendar items. We looked at the user/device activity and sorted it by the different commands being sent to the server. We then concentrated on which users/devices were sending the ‘MeetingResponse’ command, along with its frequency, time period and related details. That helped us narrow the issue down to the relevant users and their calendar-specific activity, so we could better address the underlying calendaring issue.

Another device-related command to watch for is ‘Options’: if it does not succeed for a device, an HTTP 409 error code is recorded in the IIS log.

Isolating a single day

The following command will parse only the files that match the date 12-24-2011 in the folder W3SVC1 and will only report the hits greater than 1000.

.\ActiveSyncReport.ps1 -IISLog "C:\inetpub\logs\LogFiles\W3SVC1" -LogparserExec "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe" -ActiveSyncOutputFolder c:\EASReports -MinimumHits 1000 -Date 12-24-2011

Output:

With the above information you can identify users sending a high number of requests. Within the columns, you can also see what kinds of commands those users are sending. This helps you come up with more directed and efficient troubleshooting techniques.
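
One detail worth noting when filtering by day is that the -Date value uses MM-DD-YYYY, while the date field inside a W3C log is written as YYYY-MM-DD. The Python sketch below shows the conversion and a filter on already-parsed entries. How the script itself selects files for a date is not described in this post, so treat this only as an illustration with invented data and hypothetical function names.

```python
# Sketch: convert an MM-DD-YYYY value to the W3C log date form and filter entries.
# Illustration only; the script's own file-matching logic is not shown here.
from datetime import datetime

def to_log_date(mm_dd_yyyy):
    """Convert '12-24-2011' to '2011-12-24'."""
    return datetime.strptime(mm_dd_yyyy, "%m-%d-%Y").strftime("%Y-%m-%d")

def entries_for_day(entries, mm_dd_yyyy):
    """Keep only entries whose 'date' field matches the requested day."""
    wanted = to_log_date(mm_dd_yyyy)
    return [entry for entry in entries if entry["date"] == wanted]

entries = [
    {"date": "2011-12-23", "device": "ABC123"},
    {"date": "2011-12-24", "device": "ABC123"},
    {"date": "2011-12-24", "device": "XYZ789"},
]
print(len(entries_for_day(entries, "12-24-2011")))
```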

What Should You Look For?

When analyzing IIS logs with the help of the script, look for one specific command being sent over and over again. The frequency of particular commands matters, and any command that fails frequently deserves a closer look. Also compare the wait times between executions of certain commands. Generally, commands that take a long time to execute or that result in a delayed response from the server are suspicious and should be investigated further. Keep in mind, though, that the Ping command is an exception: it takes longer to execute and appears frequently in the log, which is expected.
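
The checks described above (how often a command is sent, how often it fails, and how long it takes) can be summarized per command. The Python sketch below shows one way to do that from (command, status, time-taken) rows; the rows are invented, treating any status of 400 or above as a failure is a simplifying assumption, and time-taken is only available when that field is enabled in IIS logging.

```python
# Sketch: per-command count, failure rate and average time-taken (ms).
# Rows are invented; "status >= 400 is a failure" is a simplifying assumption.
from collections import defaultdict

def command_stats(entries):
    """entries: iterable of (cmd, sc_status, time_taken_ms) with numeric values."""
    totals = defaultdict(lambda: {"count": 0, "failures": 0, "total_ms": 0})
    for cmd, status, ms in entries:
        row = totals[cmd]
        row["count"] += 1
        row["total_ms"] += ms
        if status >= 400:
            row["failures"] += 1
    return {
        cmd: {
            "count": row["count"],
            "failure_rate": row["failures"] / row["count"],
            "avg_ms": row["total_ms"] / row["count"],
        }
        for cmd, row in totals.items()
    }

entries = [
    ("Sync", 200, 120), ("Sync", 200, 180),
    ("Ping", 200, 900000),            # long-running Ping requests are expected
    ("Provision", 403, 40), ("Provision", 403, 60),
]
stats = command_stats(entries)
print(stats["Provision"])
```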

If you notice continuous connection failures for a device with an error code of 403, that could mean the device is not enabled for EAS-based access. Sometimes mobile device users complain of connectivity issues without realizing that they are entering their credentials incorrectly (an understandably easy mistake on mobile devices). When looking through the logs, you can focus on that user and may find that the user’s device is failing after issuing the ‘Provision’ command.

Creating Reports for Monitoring

You may want to create a report or generate an e-mail with such reports and details of user activity.

The following command will parse all the IIS Logs in the folder W3SVC1 and will only report the hits greater than 1000. Additionally it will create an HTML report of the results.

.\ActiveSyncReport.ps1 -IISLog "C:\inetpub\logs\LogFiles\W3SVC1" -LogparserExec "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe" -ActiveSyncOutputFolder c:\EASReports -MinimumHits 1000 -HTMLReport

The following command will parse all the files in the folders C:\Server1_Logs and D:\Server2_Logs and will also email the generated report to ‘user@contoso.com’.

.\ActiveSyncReport.ps1 -IISLog "C:\Server1_Logs","D:\Server2_Logs" -LogparserExec "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe" -ActiveSyncOutputFolder c:\EASReports -SendEmailReport -SMTPRecipient user@contoso.com -SMTPSender user2@contoso.com -SMTPServer mail.contoso.com
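
For readers curious about what an e-mailed report involves, the Python sketch below builds a message with a CSV attachment using the standard library. It only constructs the message and does not send it; the subject line, file name and CSV content are invented, and this is not how ActiveSyncReport.ps1 itself composes its mail.

```python
# Sketch: build (but do not send) a report e-mail with a CSV attachment.
# Subject, file name and CSV content are invented for illustration.
from email.message import EmailMessage

def build_report_mail(sender, recipient, csv_text):
    """Return an EmailMessage carrying the report as a text/csv attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "EAS activity report"
    msg.set_content("The EAS activity report is attached.")
    msg.add_attachment(csv_text, subtype="csv", filename="eas_report.csv")
    return msg

csv_text = "DeviceID,Hits\nABC123,1200\n"
msg = build_report_mail("user2@contoso.com", "user@contoso.com", csv_text)
attachments = list(msg.iter_attachments())
print(msg["To"], attachments[0].get_filename())

# Sending would use smtplib against your own SMTP server, for example:
#   import smtplib
#   with smtplib.SMTP("mail.contoso.com") as smtp:
#       smtp.send_message(msg)
```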

We sincerely hope our readers find this script useful. Please let us know how the script has made your lives easier and what else we can do to enhance it.

Konstantin Papadakis and Brian Drepaul

Special Thanks to:
M. Amir Haque, Will Duff, Steve Swift, Angelique Conde, Kary Wall, Chris Lineback & Mike Lagase

Categories: Exchange feeds

.PST, Time to Walk the Plank

January 30, 2012 - 1:00pm

Ask and ye shall receive, mateys!

As we announced in July, we are always looking for new ways to make your work easier - especially when your work involves ending PST proliferation. Today, we are happy to announce that PST Capture is now available as a free download.

PST Capture helps you search your network to discover and then import .pst files across your environment - all from a straightforward admin-driven tool. PST Capture will help reduce risk while increasing productivity for your users by importing .pst files into Exchange Online or Exchange Server 2010 - directly into users' primary mailboxes or archives.

In addition to all the positive feedback you have given us regarding the Archiving, Retention, Legal Hold and Discovery capabilities of Exchange, you made it clear that PST import is an important area for us to focus on moving forward. As we looked at the best ways to address this challenging need, we saw the great work that our ISV partner, Red Gate, has done with their stellar solution. We determined that acquiring this product from Red Gate as a starting point was the best strategy for ensuring a quality product for you.

We put Red Gate’s tool through further feature development and a rigorous testing process that included beta testing with customers, passing through our internal product security gates, and overall quality assurance. It’s now ready for prime time and available as a free download here! For even more insight, watch the video below

And thus, we offer you PST Captarrrrrrrrrgh - or PST Capture, for those more refined than I.

As always, keep the feedback coming!

Ankur Kothari

Red Gate creates ingeniously simple software tools used by more than 500,000 IT professionals worldwide. The company works to uplift the market it serves through free web community sites, technical publications and conference sponsorships that reach millions annually.

Categories: Exchange feeds

Released: Update Rollup 6 for Exchange 2007 Service Pack 3

January 26, 2012 - 1:40pm

Earlier today the Exchange CXP team released Update Rollup 6 for Exchange Server 2007 SP3 to the Download Center.

Note: The post title erroneously referred to Update Rollup 3. It has been updated to reflect the correct rollup number.

This update contains fixes for a number of customer-reported and internally found issues since the release of RU5. See KB 2608656: 'Description of Update Rollup 6 for Exchange Server 2007 Service Pack 3' for more details.

We would like to specifically call out the following fixes which are included in this release:

  • DST Cadence Release for Dec 2011 - Exchange 2007
  • 2656040 An Exchange Server 2007 Client Access server may respond slowly or stop responding when users try to synchronize the Exchange ActiveSync devices with their mailboxes
  • 2498852 "0x80041606" error message when you perform a prefix search by using Outlook in online mode in an Exchange Server 2007 environment
  • 2653334 The reseed process is unsuccessful on the SCR passive node when the circular logging feature is enabled in an Exchange Server 2007 environment
  • 2617784 Journal reports are expired or lost when the Microsoft Exchange Transport service is restarted in an Exchange Server 2007 environment
  • 2289607 The week numbers displayed in OWA do not match the week numbers displayed in Outlook for English users and French users in an Exchange Server 2007 environment
General Notes:

For DST Changes: http://www.microsoft.com/time.

Note for Forefront Protection for Exchange users: If you run Forefront Protection for Exchange, be sure to perform these important steps from the command line in the Forefront directory before and after installing this rollup. Without these steps, the Exchange Information Store and Transport services will not start after you apply this update. Before installing the update, disable Forefront by running fscutility /disable. After installing the update, re-enable Forefront by running fscutility /enable.

Exchange Team

Categories: Exchange feeds
