On-Prem Storage To Azure Storage - Part-1

Project requirements 

I designed and implemented a cost-effective, end-to-end Azure-based storage solution, migrating users and associated application dependencies with zero downtime. The project involved transferring 10 TB of data while preserving NTFS permissions and ACLs and ensuring that critical Excel macros remained fully functional. This was particularly vital as Fisher Funds manages investments across bonds and equities via Bloomberg integrations. The resulting solution delivered modern security, enhanced reliability, and operational resilience while keeping costs under control.



Challenges and Preparation

1. NTFS/ACL Enumeration and Access

During the assessment phase, I identified that file share access across all departments was configured using direct user-based permissions rather than AD group-based access. I initiated a comprehensive access remediation project to align permissions with best practices. To achieve this, I utilized CJWDEV NTFS Permission Reporter to perform a recursive scan of all file shares, capturing detailed reports of user-level permissions. Using advanced Excel analysis, I mapped access data by users and departments to identify ownership and access patterns. I then engaged with departmental managers to validate required access levels and designed a new AD group structure aligned with organizational roles. Subsequently, I implemented the new model by removing direct user permissions and assigning users to appropriate AD groups, ensuring a seamless transition with no disruption to access. Links - Cjwdev

As a precautionary measure, I utilized the NTFS ACLs reporting tool to conduct a detailed investigation of the file and folder structure. The objective was to identify any permission changes within the defined path depth—both at the folder and file level (noting that file-level analysis takes longer to process)—and to enable a thorough review of the actual access rights. Links - NTFS ACLs
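The same kind of depth-limited check can be sketched in PowerShell. This is a minimal illustration rather than what either reporting tool does internally: the function name Get-ExplicitAce and the example share path are hypothetical, and it only looks at folders, listing entries that were set explicitly instead of inherited.

```powershell
# Hypothetical helper: lists explicit (non-inherited) ACL entries on a folder tree.
function Get-ExplicitAce {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $Depth = 2   # how many folder levels below $Path to inspect
    )

    # Root folder plus its subfolders down to the requested depth
    $folders = @(Get-Item -LiteralPath $Path) +
               @(Get-ChildItem -LiteralPath $Path -Directory -Recurse -Depth $Depth -ErrorAction SilentlyContinue)

    foreach ($folder in $folders) {
        $acl = Get-Acl -LiteralPath $folder.FullName
        foreach ($ace in $acl.Access) {
            # Explicit entries are where permissions deviate from the parent folder
            if (-not $ace.IsInherited) {
                [PSCustomObject]@{
                    Folder   = $folder.FullName
                    Identity = $ace.IdentityReference.Value
                    Rights   = $ace.FileSystemRights
                    Type     = $ace.AccessControlType
                }
            }
        }
    }
}
```

For example, `Get-ExplicitAce -Path '\\fileserver\share' -Depth 3 | Export-Csv C:\temp\explicit-aces.csv -NoTypeInformation` would produce a report that can be pivoted by identity, assuming the account running it can read the ACLs.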

2. Excel Macros and UNC Paths

Due to the nature of the business, the Investment Team relied heavily on complex Excel workbooks containing advanced computational formulas and interlinked macros referencing data across multiple files. Many of these macro-enabled workbooks contained hardcoded UNC paths within their workbook links. This posed a high risk: any change to the UNC paths could break the links, disrupt formula dependencies, and potentially lead to data inaccuracies or investment calculation anomalies. Such disruptions could have severe downstream impacts, as the Investment Team’s trading decisions for shares and bonds in global markets depend on these figures.

To mitigate this risk, I developed a PowerShell-based solution that automatically scanned all Excel macro files within the Investment shared directory, extracted their embedded workbook links, and generated a report identifying files with hardcoded UNC paths. This report highlighted exactly which files required link path updates, enabling precise remediation while protecting data integrity and business continuity. You can use this script in your own environment; just modify the UNC path. It requires Excel to be installed on the machine running it.

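# Install the ImportExcel module if you haven't already
Install-Module -Name ImportExcel -Scope CurrentUser -Force -AllowClobber

# Import the module (used only for Export-Excel at the end)
Import-Module -Name ImportExcel -DisableNameChecking

# Root folder to scan; change this UNC path for your environment
$path = "\\ff-az-afs-01.internal.local\data\Portfolio Data\Middle Office"

# Include macro-enabled (*.xlsm) workbooks as well as *.xls and *.xlsx
$excelSheets = Get-ChildItem -Path $path -Include *.xls,*.xlsx,*.xlsm -Recurse -File

# Start a hidden Excel instance and suppress prompts so the scan runs unattended
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$excel.DisplayAlerts = $false
# 3 = msoAutomationSecurityForceDisable: do not run macros while scanning
$excel.AutomationSecurity = 3

$results = @()

foreach ($excelSheet in $excelSheets) {
    # Open without updating links (second argument = 0) and read-only (third argument)
    $workbook = $excel.Workbooks.Open($excelSheet.FullName, 0, $true)
    $sheetName = $excelSheet.Name
    Write-Host "Excel Sheet: $sheetName"

    # LinkSources(1) returns the workbook's Excel links (xlExcelLinks), or nothing if it has none
    foreach ($link in $workbook.LinkSources(1)) {
        $linkInfo = [PSCustomObject]@{
            "FilePath" = $excelSheet.FullName
            "Sheet"    = $sheetName
            "Link"     = $link
        }
        $results += $linkInfo
    }

    # Close without saving changes
    $workbook.Close($false)
}

# Shut down Excel and release the COM object so no Excel process is left behind
$excel.Quit()
[void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
$excel = $null
[GC]::Collect()
[GC]::WaitForPendingFinalizers()

# Export results to Excel
$results | Export-Excel -Path "C:\temp\output.xlsx" -AutoSize -ClearSheet -WorksheetName "Links"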
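To narrow the report down to links that point at a UNC path, the collected rows can be filtered and grouped by server. The snippet below is a small sketch under stated assumptions: Select-UncLink is a hypothetical helper name, and the sample rows are made up to mimic the shape of the $results objects produced by the script.

```powershell
# Hypothetical helper: keeps only rows whose Link is a UNC path and splits out server and share.
function Select-UncLink {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [psobject] $LinkInfo
    )
    process {
        # A UNC path starts with two backslashes, then a server name, a backslash and a share name
        if ($LinkInfo.Link -match '^\\\\(?<server>[^\\]+)\\(?<share>[^\\]+)') {
            [PSCustomObject]@{
                FilePath = $LinkInfo.FilePath
                Link     = $LinkInfo.Link
                Server   = $Matches['server']
                Share    = $Matches['share']
            }
        }
    }
}

# Made-up rows shaped like the $results objects, for illustration only
$sample = @(
    [PSCustomObject]@{ FilePath = 'C:\demo\a.xlsm'; Sheet = 'a.xlsm'; Link = '\\oldserver\data\prices.xlsx' }
    [PSCustomObject]@{ FilePath = 'C:\demo\b.xlsx'; Sheet = 'b.xlsx'; Link = 'C:\local\rates.xlsx' }
)

# Count UNC links per server to see which paths need remediation
$sample | Select-UncLink | Group-Object Server | Select-Object Name, Count
```

In practice `$results | Select-UncLink` would replace the sample data, and the grouped output gives a quick view of which legacy servers are still referenced.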








3. Intune Limitation

As I was building a complete end-to-end solution, I couldn’t overlook drive mapping. The complexity increased when I discovered that Fisher Funds did not use Group Policies. At that time, Microsoft Intune had several limitations — one of them being the inability to natively configure drive mappings, as the required architectural layer was yet to be developed by Microsoft.

Thanks to Rudy Ooms from the Call4Cloud.nl community, a solution was available by creating custom ADMX templates that could be imported into Intune. These templates enable tenant administrators to create configuration policies for drive mapping. There are two ways to import these ADMX templates — either through Intune ADMX import or via Configuration Service Provider (CSP).

The whole documentation and files are available here - Links Call4Cloud.nl

For my project, I used the first option; however, it’s crucial to understand how Intune and CSP work under the hood. CSP stands for Configuration Service Provider. While some may assume Intune itself is a CSP, that’s incorrect. Intune is an MDM (Mobile Device Management) service, whereas CSPs are components of the Windows operating system — similar to how a Client-Side Extension (CSE) functions for Group Policy. CSPs provide the interface that allows IT administrators to apply device-specific settings to Windows Endpoints. Intune simply delivers those settings.

Windows devices enrolled in Intune use a specific synchronization mechanism with the tenant. Each enrolled device has a set of Task Scheduler jobs created under a folder named with a long random string. Among the multiple schedules within it, “Schedule #3” plays a critical role in policy deployment. Once users are added to the appropriate AD groups, the initial sync typically occurs when the device reboots.

However, in scenarios where the scheduled sync doesn’t run, or when configuration changes are made (such as modifying a drive letter or removing a drive mapping), devices can take up to 8 hours to fetch updated policies. This delay is by design. During migration or testing, you can manually trigger “Schedule #3” to expedite synchronization and accelerate deployment validation.
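Triggering that schedule by hand can also be scripted. The following is a rough sketch, not an official procedure: it assumes the enrollment tasks live under \Microsoft\Windows\EnterpriseMgmt\ and that the task name begins with "Schedule #3", which is how they appear on the Windows 10/11 devices I have seen. Start-IntunePolicySync is a hypothetical function name, and it needs to run in an elevated session on the enrolled device.

```powershell
# Hypothetical helper: starts the enrollment client's "Schedule #3" task to force a policy sync.
function Start-IntunePolicySync {
    [CmdletBinding()]
    param(
        # Assumption: the task name starts with this text on enrolled devices
        [string] $TaskNamePattern = 'Schedule #3*'
    )

    # Assumption: enrollment tasks sit in a per-enrollment subfolder of EnterpriseMgmt
    $tasks = @(Get-ScheduledTask -TaskPath '\Microsoft\Windows\EnterpriseMgmt\*' -ErrorAction SilentlyContinue |
        Where-Object { $_.TaskName -like $TaskNamePattern })

    if ($tasks.Count -eq 0) {
        Write-Warning 'No matching enrollment task found; is this device MDM-enrolled?'
        return
    }

    foreach ($task in $tasks) {
        Write-Host "Starting '$($task.TaskName)' in $($task.TaskPath)"
        $task | Start-ScheduledTask
    }
}
```

Running `Start-IntunePolicySync` right after changing a drive mapping policy is a quick way to confirm the change on a test device without waiting for the next scheduled sync.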


 

I encountered a particularly unusual configuration — their IT department had assigned the “A” drive letter to the Accounting department’s shared drive. By design and best practice, the letters A, B, and C are traditionally reserved for system use (historically for floppy and system drives). Since modern ADMX templates are built on this convention, the drive mapping policies begin from drive letter D onwards.

This presented a challenge: either reassign a new drive letter during the migration and risk breaking existing dependencies, or find a way to retain the “A” drive mapping. After careful consideration, I devised a solution to modify both the ADMX and ADML files — manually adding drive letter codes for A, B, and C into the XML schema. Once updated, I re-ingested these customized templates into Intune, which successfully allowed me to create device configuration policies that mapped the network drives referencing Azure Files mount points.


4. Operating System and File Server Configurations

The legacy file servers were configured with both Data Deduplication and Shadow Copy features. Data Deduplication is a storage optimization technique that reduces disk usage by identifying and eliminating duplicate copies of data, replacing them with pointers to a single instance stored in a deduplication chunk store. This chunk store contains unique file segments ("chunks") referenced by reparse points within the original file structure.

Before migrating data to Azure Files, it was critical to ensure that we transferred complete and original file data, not deduplicated (trimmed) placeholders. To achieve this, I executed a controlled process to disable deduplication on each file server volume. This involved running the deduplication unoptimization process, followed by a scrubbing operation to unindex the chunk store and restore all original files to disk.

It is essential to verify sufficient free disk space before initiating unoptimization, as the process temporarily expands storage requirements while reconstructing full, uncompressed files.

Deduplication commands that you need to execute for each volume in your environment.
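As a hedged illustration of that sequence, the sketch below uses the Deduplication module cmdlets Get-DedupStatus and Start-DedupJob. The function names Test-DedupHeadroom and Invoke-DedupRollback, and the 10% safety margin, are my own assumptions for the example. It follows the order described above (unoptimization, then scrubbing); confirm on your OS version that the scrubbing job is accepted once unoptimization has finished.

```powershell
# Hypothetical helper: true when free space covers what dedup is currently saving, plus a margin.
function Test-DedupHeadroom {
    param(
        [Parameter(Mandatory)] [double] $FreeSpaceBytes,
        [Parameter(Mandatory)] [double] $SavedSpaceBytes,
        [double] $Margin = 0.10   # assumption: 10% safety margin, tune as needed
    )
    return $FreeSpaceBytes -ge ($SavedSpaceBytes * (1 + $Margin))
}

# Hypothetical wrapper: rehydrates all deduplicated files on one volume, e.g. 'K:'.
function Invoke-DedupRollback {
    param([Parameter(Mandatory)] [string] $Volume)

    # Check free space first; rehydrated files need roughly the space dedup was saving
    $status = Get-DedupStatus -Volume $Volume
    if (-not (Test-DedupHeadroom -FreeSpaceBytes $status.FreeSpace -SavedSpaceBytes $status.SavedSpace)) {
        throw "Not enough free space on $Volume to rehydrate deduplicated data."
    }

    # Restore every optimized file to its full size; -Wait blocks until the job completes
    Start-DedupJob -Volume $Volume -Type Unoptimization -Wait

    # Scrubbing pass, as described in the text above
    Start-DedupJob -Volume $Volume -Type Scrubbing -Wait

    # Show the final state so the result can be verified
    Get-DedupStatus -Volume $Volume
}
```

Calling `Invoke-DedupRollback -Volume 'K:'` once per volume keeps the process repeatable, and `Get-DedupJob` can be run in a second session to watch progress on large volumes.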

   




Before (with dedup)


After (without dedup)

As you can see, the space consumed by the K drive has increased. Almost 162 GB of storage space had been saved by deduplication and compression. Unfortunately, I had to disable dedup because I did not want to migrate broken data. Data integrity is imperative for any migration.

You can read more about deduplication at the following links:
Microsoft (Link-1), Gilbert Blog (Link-2), Dedup ChunkStore (Link-3), Deployment Research (Link-4)

Continuation PART - 2 
