Transcript
Um, so imagine hiring a brilliant mathematician, right? Like someone who can write flawless software code in seconds, but then they completely fail a basic reading comprehension test. Yeah, it sounds absurd, but that is actually exactly what the modern workforce is doing right now with artificial intelligence. Exactly. I mean, the prevailing anxiety for anyone navigating the current economy is really just one big question: will AI take my job or just change it completely? Right. And we hear endless speculation about that. But, um, very little of it is actually grounded in how these models function on a granular level. Yeah, we hear a lot of noise, but not a lot of hard numbers. So today, for this deep dive, we've got our hands on a brand new study. It's from April 2026, out of Poon University. Right. It's called "The AI Skill Shift." Yes. And the whole mission here is to map the hard empirical data directly to your career, because the timing of this data, I mean, it couldn't be more critical. Oh, absolutely, especially since the people actually running the global economy cannot seem to agree on what is happening. Like, we are seeing this very public, highly fractured debate among industry leaders. The consensus is totally non-existent right now. I mean, there are three really prominent viewpoints dominating finance and tech. Right. Starting with the really aggressive side. Yeah. You have JPMorgan's Jamie Dimon. He was at the Hill and Valley Forum back in March 2026, and he basically said AI displacement is an immediate structural shift. Like, it's happening today and we need massive retraining efforts right now. But then you have David Solomon at Goldman Sachs, who is in a much more moderate camp. Exactly. He explicitly said he isn't predicting some sort of job apocalypse. He thinks the economy is nimble enough to absorb the tech, though he does admit the short-term disruption is going to be pretty turbulent. Right. And then, jumping to the furthest end of the spectrum, you have Dario Amodei. The CEO of Anthropic. Right. And his warning was just stark. He said up to 50% of entry-level white-collar jobs could just vanish within the next five years, which is a massive number. So you literally have the top minds in finance and tech entirely split. I mean, it's either a complete hollowing out of the workforce, a manageable transition, or a retraining emergency. Yeah. But here's the thing. Dimon and Solomon, they're debating from this, like, 10,000-foot macro view. Exactly. And what's fascinating here is that despite these massive CEO predictions, very few people are actually testing AI against the standardized skills that make up real jobs. Because an occupation isn't just one big monolithic thing, right? It's a bundle of really specific tasks. Exactly. And the researchers at Poon University realized that. So instead of asking, can AI do this job, they benchmarked four frontier large language models against the 35 specific skills defined by the US Department of Labor's O*NET taxonomy. They basically gave the AI a standardized test for every core component of the American workforce. Right. And they codified all this into what they call the Skill Automation Feasibility Index, or SAFI score. Okay. Let's unpack this, because the setup for this test was pretty intense, right? Very intense. The methodology is incredibly robust. They threw 263 distinct text-based tasks at four massive frontier models. And these are the big ones.
Llama 3.3 70B, Mistral Large, Qwen 2.5 72B, and Gemini 2.5 Flash. Yep. Across all those tasks and models, the researchers made 1,052 total model calls. And what immediately stands out, just from a reliability standpoint, is the 100% completion rate. Wait, 100%? Like, no crashes at all? Zero system failures, zero timeouts. The models executed every single command they were given. Wow. Okay, but executing a command and actually doing it well are two different things. Oh, completely. The quality of execution varied wildly depending on the skill. Right. So the SAFI scores are on a zero-to-100 scale, and when the results came in, mathematics and programming just absolutely dominated. Yeah, mathematics scored a 73.2 and programming came in at 71.8. But then you look at the absolute bottom of the index. Active listening scored a 42.2, and reading comprehension sat at just 45.5. Which, you know, sounds a bit crazy at first. It is crazy. I find it so counterintuitive. I mean, we are talking about large language models. Their entire architecture is literally built on processing text. Right. So how does a text-based machine fail at reading comprehension? It's like finding out a calculator is a genius at calculus but can't figure out a restaurant tip. Why is it failing at the social, communicative stuff? Well, it really comes down to the fundamental difference between structured quantitative reasoning and nuanced human communication. Break that down for me. So LLMs, at their core, are just incredibly sophisticated next-token prediction engines. They calculate the probabilistic sequence of words based on their training data. Right. They're guessing what comes next. Exactly. So they excel at math and coding because those are bounded, deterministic systems. They have strict rules. If a mathematical equation balances, it balances. Yeah, programming syntax is absolute. There's no gray area. Right. But active listening and reading comprehension, those require an internal state of mind that the model just simply does not possess. Oh, I see. So using an AI for active listening is kind of like using a GPS to determine if your passenger is enjoying the road trip. That is a perfect analogy. Yeah. Like, the GPS can perfectly map the route, the structured data, but it has absolutely no sensors for the human experience happening inside the car. Exactly. Because active listening in a real professional environment involves inferring unspoken subtext. It's interpreting social dynamics and emotional shifts. And the model can't read between the lines because it has no lived experience to draw any context from. That is an incredibly precise way to look at it. It can process the syntax of the conversation, but it totally misses the intent. Wow. And this mechanistic limitation actually leads directly to another huge finding in the study, which they call model convergence. Right. Because they tested models from all over the place. Yeah. Entirely different architectures, different training data, different countries. Llama and Gemini are from the US, Mistral's from France, and Qwen is from China. And yet didn't they all score basically the same? Pretty much. All four models scored within a remarkably tight 3.6-point spread across the board. Mistral Large led slightly with a 60.0 average, but their actual capability profiles were virtually identical. So for anyone listening right now who has to make workforce or software purchasing decisions, this is huge. It really is. Because it means text-based automation depends entirely on the inherent nature of the task, not on which specific vendor you buy from.
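To make that convergence finding concrete, here is a minimal sketch, in Python, of how per-model averages and the cross-model spread could be computed from a scores table like the study's. The table values and model keys below are illustrative stand-ins, not the paper's raw data.

```python
# Illustrative sketch: per-model SAFI averages and the cross-model spread,
# computed from a {model: {skill: score}} table.
# All numbers below are stand-ins, not the study's raw data.

scores = {
    "llama-3.3-70b":    {"mathematics": 72.8, "programming": 71.5, "active_listening": 41.9},
    "mistral-large":    {"mathematics": 74.0, "programming": 72.3, "active_listening": 43.0},
    "qwen-2.5-72b":     {"mathematics": 73.1, "programming": 71.2, "active_listening": 42.1},
    "gemini-2.5-flash": {"mathematics": 72.9, "programming": 72.2, "active_listening": 41.8},
}

# Average SAFI score per model across all tested skills.
averages = {
    model: sum(by_skill.values()) / len(by_skill)
    for model, by_skill in scores.items()
}

# "Convergence" here = the gap between the best and worst model average.
spread = max(averages.values()) - min(averages.values())

for model, avg in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(f"{model:18s} {avg:5.1f}")
print(f"cross-model spread: {spread:.1f} points")
```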
Exactly. You don't need to completely overhaul your workflow every time a new model drops. Right. Because at that 70-billion-parameter frontier, which, just as a reminder, is a measure of the model's neural network size, they're all hitting the exact same ceiling when it comes to unstructured human nuance. The models are standardizing, but the real insight comes when we look at how those test scores translate to actual offices and factories right now. Yeah, because the researchers cross-referenced their SAFI scores with the Anthropic Economic Index, right? They did. And that index tracked real-world AI adoption across 756 different occupations. And here's where it gets really interesting, because when I was reading through that Anthropic data, the paradox was just glaring. The capability-demand inversion. Yes. The paradox is that the occupations with the highest AI exposure scores, so the jobs where workers are using AI constantly throughout the day, are heavily concentrated in the exact same skills the AI scored the absolute lowest on. It's wild to see it in the data. Right. Reading comprehension, writing, and active listening are the dominant skills in these highly AI-exposed jobs. Wait, if AI is so terrible at active listening and reading comprehension, why are the jobs that require those skills the ones using AI the most? It seems like a contradiction, right? Totally. Isn't that like hiring an intern who can't read to run your book club? Well, if we connect this to the bigger picture, it actually makes perfect sense when you break down the workflow. Let's look at customer service representatives. They have a massive AI exposure score of 0.701. And market research analysts are up there too, at 0.648. Exactly. And the core value of those roles relies heavily on the exact nuanced human communication we just said AI is terrible at. A customer service rep has to de-escalate an angry client. Yeah. And a market research analyst has to figure out human buying psychology. Right. But the key here is that the AI isn't doing the de-escalation. The AI is a productivity tool handling the structured data retrieval that used to slow the human down. Oh, I see. So while the customer is yelling on the phone, the AI is instantly pulling up their purchase history, cross-referencing warranties, and, like, formatting a refund table. Exactly. It does the heavy lifting on the structured, deterministic side. That frees up the human's cognitive bandwidth to actually practice active listening and manage the emotional side of the call. Okay, that makes so much sense. So the human and the AI are dividing the labor based on what their architectures are actually good at. The AI does the syntax and the human does the sentiment. Precisely. And to see if this holds up at scale, the researchers analyzed 3,364 specific real-world task interactions. And what did they find? Is it mostly automation, or are humans still involved? The data strongly refutes the narrative of total, immediate automation. Nearly 80%, 78.7% to be exact, of AI interactions are what they call augmentation. Wow. So only 21.3% are true automation. Yep. The vast majority are collaborative feedback loops. The human asks for a draft. The AI generates it. The human corrects the tone. The AI revises it. It's interactive. Completely. It is rarely directive, end-to-end task completion where you just push a button and walk away.
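A minimal sketch of how that augmentation-versus-automation split could be tallied, assuming each interaction record carries a simple mode label. The records and label names here are hypothetical, not the study's actual schema.

```python
# Illustrative sketch: tallying augmentation vs. automation from
# interaction records labeled the way the study describes.
# The records and the "mode" label are hypothetical.

interactions = [
    {"task": "draft refund email",   "mode": "augmentation"},  # human revises the output
    {"task": "reformat data table",  "mode": "automation"},    # end-to-end, no feedback loop
    {"task": "summarize call notes", "mode": "augmentation"},
    # ... the study analyzed 3,364 such real-world task interactions
]

total = len(interactions)
augmented = sum(1 for i in interactions if i["mode"] == "augmentation")

print(f"augmentation share: {augmented / total:.1%}")          # the study found 78.7%
print(f"automation share:   {(total - augmented) / total:.1%}")  # and 21.3%
```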
Okay. That brings us right back to David Solomon's example at Goldman Sachs. The IPO prospectus example. Yes. Right. So he pointed out that AI can now draft 95% of an initial public offering prospectus, the S-1 filing, in just minutes. Which is incredible, because historically that document required a six-person team working grueling hours for two solid weeks. Yeah. But he noted that the last 5% is the human judgment, and he says that is now the most critical part. Well, wait, let me push back on this a bit. Yeah. If an AI is doing 95% of the drafting that a six-person team used to do, aren't 95% of those six people out of a job? Is this augmentation just a fancy corporate word for "you're fired, but your boss has a new toy"? That is the natural instinct, but it misses a crucial nuance about white-collar work. Drafting the text of an S-1 filing is only a fraction of the actual job. Because of the liability, right? Exactly. An IPO prospectus carries massive legal liability. An AI can aggregate the financial tables and spit out boilerplate risk factors, but an AI cannot absorb legal risk. Right. Because the SEC audits the filing, and you can't exactly send a language model to jail. Exactly. If the framing misleads shareholders, the human executives go to federal prison. So for those entry-level workers Dario Amodei was so worried about, the shift isn't necessarily from employment to unemployment. It's a shift in what they actually do all day. Yes. It moves from routine creation to verification and liability absorption. You cannot submit an AI-generated S-1 without exhaustive human auditing. So the workers who thrive aren't the ones manually typing tables the fastest anymore. No. The ones who thrive are the ones who learn to direct, evaluate, and refine the AI outputs. They spot the hallucinations and align the document with legal boundaries. So the job stays, but the required skill set shifts drastically. And to help workers figure out exactly where they stand in all this, the researchers created this really useful tool called the AI Impact Matrix. Right. It's a four-quadrant map. It plots a skill's AI capability, the SAFI score, against its real-world AI exposure from that Anthropic data. Let's walk everyone through this so they can figure out where they sit. What's quadrant one? Quadrant one is high displacement risk. These are roles with high AI capability and high real-world exposure. Like junior programming. Exactly. The AI scored a 71.8 in programming, and tech industry exposure is massive. So for developers in this quadrant, just knowing basic syntax is functionally obsolete. Because the AI is already a master of syntax. Right. To survive quadrant one, developers have to level up to system architecture and cross-functional design. They need to understand business logic, not just code compilation. Okay, what about quadrant two? That's the AI-augmented zone. This is our capability-demand inversion area. So, high real-world exposure, but low inherent AI capability. Exactly. This is your financial analysts, educators, customer service roles. The core value relies on complex judgment and empathy. So the AI acts as a high-powered co-pilot for the data, but it can't replicate the human-to-human transactions. Spot on. Then we move to quadrant three, which is the upskilling window. This one is fascinating because it totally flips the conventional wisdom around blue-collar work. It really does.
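Here is a minimal sketch of the matrix as a function, assuming simple threshold cutoffs on both axes; the study's actual cutoffs aren't given in this discussion, so the values below are assumptions, and the fourth quadrant's label comes from the part of the conversation that follows.

```python
# Illustrative sketch of the AI Impact Matrix as a function.
# capability = a skill's SAFI score (0-100); exposure = its real-world
# AI exposure index (0-1). The cutoff values are assumptions, not the
# study's published thresholds.

CAPABILITY_CUTOFF = 55.0   # assumed midpoint on the 0-100 SAFI scale
EXPOSURE_CUTOFF = 0.40     # assumed midpoint on the exposure index

def quadrant(capability: float, exposure: float) -> str:
    if capability >= CAPABILITY_CUTOFF and exposure >= EXPOSURE_CUTOFF:
        return "Q1: high displacement risk"    # e.g., junior programming
    if capability < CAPABILITY_CUTOFF and exposure >= EXPOSURE_CUTOFF:
        return "Q2: AI-augmented zone"         # e.g., customer service
    if capability >= CAPABILITY_CUTOFF and exposure < EXPOSURE_CUTOFF:
        return "Q3: upskilling window"         # e.g., diagnostics-heavy trades
    return "Q4: lower displacement risk"       # e.g., trial lawyers

print(quadrant(71.8, 0.80))  # programming-like profile -> Q1
print(quadrant(42.2, 0.70))  # active-listening-heavy role -> Q2
```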
This quadrant features moderate to high AI capability on the test, but currently has very low real-world exposure. We're talking about physical, technical trades, right? HVAC technicians, manufacturing operators, field mechanics. But wait, the study said AI only does text-based tasks. Right. How can AI have high capability for an HVAC technician? Is the AI going to text my broken air conditioner? That's the perfect question, and it's a great nuance. Obviously, a language model isn't physically turning a wrench, but the high capability score proves that these physical trades contain a massive layer of cognitive and diagnostic work. Oh, I see. So the physical execution and the cognitive diagnosis are being decoupled. Exactly. If a mechanic feeds sensor data, pressure readings, and audio diagnostics into a frontier model, the AI can cross-reference that against every service manual ever published and give you the exact repair steps instantly. So, right now, just the sheer physical presence required to access the machinery is what insulates these workers. They have negligible AI exposure. Currently, yes. But as diagnostic sensor platforms bridge the gap between the physical world and the digital network, that insulation will evaporate. Which is why they call it the upskilling window. There is a narrow time frame right now for physical trades people to adapt before the transition accelerates. Exactly. The mechanic who learns to integrate AI diagnostics into their workflow will multiply their value massively. Makes sense. Okay, finally, quadrant four. This is lower displacement risk. Right. Low AI capability and low real-world exposure. We are talking about jobs requiring deeply embodied human judgment and high-stakes physical or emotional intervention. Specialized surgeons, physical therapists, trial lawyers in a courtroom. Exactly. The value here is intrinsically tied to human-to-human stakes. A trial lawyer isn't just reading case law. They're reading the jury. Right. They're feeling the emotional temperature of the room, and LLMs lack physical embodiment and emotional state, so they just can't execute these roles. And the workforce isn't even attempting to expose these roles to automation. Okay. So what does this all mean? When we look at the SAFI testing, the Anthropic data, and this matrix, what is the ultimate takeaway for you, the listener? The main takeaway is that the binary narrative of "will AI take my job" is just too simple. AI is an exceptional engine for rule-bound reasoning. It will consume structured data and syntax. But it is measurably weak at the connective tissue of the modern economy. Exactly. Listening, reading, speaking, social perception, those maintain their premium. In almost every high-value workflow, you, the human, are responsible for the last mile of judgment. And remember, this data represents a snapshot of early 2026 at the 70-billion-parameter frontier. The window to proactively prepare is open right now. It is. You don't need to fear the AI. You need to learn to manage it. Invest in human-AI workflow design and critical evaluation.
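As one concrete instance of that human-AI workflow design, here is a minimal sketch of the diagnostic decoupling described in the trades discussion above: packaging machine telemetry into a structured prompt while the physical work stays with the technician. The query_model function and the readings are hypothetical stand-ins, not any real client library or dataset.

```python
# Illustrative sketch: decoupling cognitive diagnosis from physical
# execution by packaging machine telemetry into a diagnostic prompt.
# query_model() is a hypothetical stand-in, not a real client library.

def query_model(prompt: str) -> str:
    raise NotImplementedError("swap in your actual model client here")

readings = {
    "suction_pressure_psi": 58,
    "discharge_pressure_psi": 412,   # high-side pressure looks elevated
    "compressor_amps": 31.5,
    "symptom": "unit short-cycles every 4-5 minutes",
}

prompt = (
    "You are assisting an HVAC technician. Given these readings, list the "
    "most likely faults and the repair steps to verify each one:\n"
    + "\n".join(f"- {k}: {v}" for k, v in readings.items())
)
print(prompt)

# The human still turns the wrench; the model only proposes hypotheses.
# diagnosis = query_model(prompt)
```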
And you know, this raises an important question that the researchers brought up at the very end of the study. Oh, about the government definitions? Yeah. We've been talking about the O*NET taxonomy of 35 skills, but that list was created before LLMs even existed. So the framework we're using to measure the workforce is fundamentally outdated. Exactly. The researchers point out that entirely new skills are emerging right now. Skills like multi-agent orchestration, which is like setting up multiple AI models with different roles to manage a complex project without a human handholding every step. Right. Or prompt engineering, the precise linguistic framing to get the best reasoning out of a model. These are vital skills for the augmented workflow, but they aren't even on the official labor list yet. They literally don't have an official name. And that is the ultimate challenge for the listener right now. If the most vital career skill you'll need in five years doesn't even have a name today, how will you start practicing it tomorrow? You can't wait for a corporate training program. You have to actively build that intuition yourself. Exactly. Think about that mathematician who can write code but can't read a room. The machine provides the calculation. You provide the comprehension. Look at your own daily workflow and ask yourself: where does the calculation end, and where does your last mile of human judgment begin?
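As a closing illustration of the multi-agent orchestration idea the hosts mention above, here is a minimal sketch of the pattern: role-scoped agents sharing one model client, with a coordinator routing work between them. The call_model function and the role prompts are hypothetical stand-ins, not any particular framework's API.

```python
# Illustrative sketch of multi-agent orchestration: several role-scoped
# "agents" share one hypothetical model client, and a coordinator routes
# a task through them in sequence. call_model() is a stand-in.

def call_model(system_role: str, message: str) -> str:
    # Stand-in for a real model API call.
    return f"[{system_role} output for: {message[:40]}...]"

ROLES = {
    "planner":  "Break the project into ordered, concrete steps.",
    "drafter":  "Produce a first draft for the current step.",
    "reviewer": "Critique the draft and flag anything a human must verify.",
}

def orchestrate(task: str) -> str:
    plan = call_model(ROLES["planner"], task)
    draft = call_model(ROLES["drafter"], plan)
    review = call_model(ROLES["reviewer"], draft)
    # The human's "last mile": final judgment stays outside this loop.
    return review

print(orchestrate("prepare a market-research summary for a new product"))
```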