
Contents lists available at ScienceDirect

Journal of Informetrics

journal homepage: www.elsevier.com/locate/joi

Gender-based homophily in research: A large-scale study of man-woman collaboration

Marek Kwiek a, Wojciech Roszka b

a Institute for Advanced Studies in Social Sciences and Humanities (IAS), UNESCO Chair in Institutional Research and Higher Education Policy, Adam Mickiewicz University in Poznan, Poland

b Poznan University of Economics and Business, Poznan, Poland

Article info

Keywords: Research collaboration; Co-authorships; Gender gap; Sociology of science; Homophily; Scientific careers; Publishing patterns; Probabilistic record linkage; Sex differences

Abstract

We examined the male-female collaboration practices of all internationally visible Polish university professors (N = 25,463) based on their Scopus-indexed publications from 2009–2018 (158,743 journal articles). We merged a national registry of 99,935 scientists (with full administrative and biographical data) with the Scopus publication database, using probabilistic and deterministic record linkage. Our unique biographical, administrative, publication, and citation database (“The Polish Science Observatory”) included all professors with at least a doctoral degree employed in 85 research-involved universities. We determined what we term an “individual publication portfolio” for every professor, and we examined the respective impacts of biological age, academic position, academic discipline, average journal prestige, and type of institution on the same-sex collaboration ratio. The gender homophily principle (publishing predominantly with scientists of the same sex) was found to apply to male scientists, but not to females. The majority of male scientists collaborate solely with males; most female scientists, in contrast, do not collaborate with females at all. Across all age groups studied, all-female collaboration is marginal, while all-male collaboration is pervasive. Gender homophily in research-intensive institutions proved stronger for males than for females. Finally, we used a multi-dimensional fractional logit regression model to estimate the impact of gender and other individual-level and institutional-level independent variables on gender homophily in research collaboration.

1. Introduction

Science is a collaborative enterprise, with (male and female) scientists collaborating internationally, nationally, and institutionally (Wuchty, Jones, & Uzzi, 2007; Wagner, 2018). However, this is not our topic: our focus is on male–male, female–female, and male–female (or mixed-sex) research collaboration rather than collaboration across countries and institutions. The dominant view in the literature is that, on average, males collaborate more often with males, and females collaborate more often with females (Jadidi, Karimi, Lietz, & Wagner, 2018; Lerchenmueller, Hoisl, & Schmallenbach, 2019; Wang, Lee, West, Bergstrom, & Erosheva, 2019; Holman & Morandin, 2019; Boschini & Sjögren, 2007; McDowell & Smith, 1992). We test this hypothesis using a large-scale dataset with unique variables.

According to the homophily principle, “similarity breeds connection”; consequently, personal networks are homogeneous with regard to many sociodemographic and personal characteristics (such as age, ethnic origin, class origin, wealth, education, and gender). On the positive side, homophily is reported to simplify communication (McPherson, Smith-Lovin, & Cook, 2001; Kegen, 2013). However, on the negative side, homophily may “limit people’s social worlds in a way that has powerful implications for the information they receive, the attitudes they form, and the interactions they experience” (McPherson et al., 2001). As science is increasingly collaborative, the homophily principle may increasingly influence academic careers. Research collaboration in science (or gender co-authorship patterns) provides fertile ground to test the homophily principle.

E-mail addresses: kwiekm@amu.edu.pl (M. Kwiek), wojciech.roszka@ue.poznan.pl (W. Roszka).
https://doi.org/10.1016/j.joi.2021.101171

Received 6 June 2020; Received in revised form 29 April 2021; Accepted 5 May 2021

1751-1577/© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)

Man–woman research collaboration patterns in science are contrasted in this paper through six lenses: biological age, academic position, academic discipline, gender-defined research collaboration type, journal prestige, and institutional research intensity. The individual scientist, rather than the individual article, is the unit of analysis. The key innovative methodological step is the determination of what we term an “individual publication portfolio” (for the decade of 2009–2018) for every internationally visible Polish scientist (N = 25,463 university professors from 85 universities, grouped into 27 disciplines, along with their 164,908 international collaborators, who together authored 158,743 Scopus-indexed publications). Co-authorships are used for the operationalization of research collaboration, following standard bibliometric practice.

The individual publication portfolio reflects the distribution of gender-defined research collaboration types (same-sex collaboration and mixed-sex collaboration) for every individual scientist. Team formation in academia, understood as publishing with coauthors of varying numbers and different genders, is voluntary (McDowell & Smith, 1992): researchers team up when they think that they are better off collaborating than publishing alone. The teams formed, or the articles published, are likely to reflect “individual tastes and perceptions of the returns to collaboration, as well as the costs of coordination” (Boschini & Sjögren, 2007, p. 327). Some male scientists collaborate predominantly with other males, and some female scientists collaborate predominantly with other females. Still others prefer to publish in mixed-sex collaborations (or to author individually). We examine the same-sex collaboration ratio at the individual level for every internationally visible Polish scientist (i.e., only authors with Scopus-indexed publications) and generalize the results from the individual level to the level of the national higher education system.

2. Literature review

2.1. The gender context of science

The gender context of academic science has changed substantially in the past few decades (Huang, Gates, Sinatra, & Barabási, 2020; Larivière, Ni, Gingras, Cronin, & Sugimoto, 2013), with more female scientists entering the higher education sector (Elsevier, 2018) and occupying high academic positions (Zippel, 2017; Diezmann & Grieshaber, 2019). Male and female scientists often pursued or were pushed onto somewhat different career tracks and were located in different academic structures, with “differential access to valuable resources” (Xie & Shauman, 2003, p. 193). Females, as new entrants into a traditionally male-dominated academic profession, initially did not have equal access to professional networks (McDowell, Singell, & Stater, 2006). But the academic world is changing. New bibliometric literature applying various gender-determination methods to authors and authorships (Halevi, 2019; Elsevier, 2020) brings new data-driven insights into gender disparities in science, and the literature has become much less reliant on anecdotal and localized studies (Larivière et al., 2013). Women are plugging into networks over time as the profession becomes more gender representative (as shown for academic economists by McDowell et al., 2006, p. 154). However, somewhat paradoxically, the increased participation of women in STEM disciplines is reported to have been accompanied by an increase in gender differences regarding both productivity and impact (Huang et al., 2020, p. 8; Elsevier, 2018, p. 16).

As recent literature highlights, female scientists occupy more junior positions and receive lower salaries, are more often in non-tenure-track and teaching-only positions, are promoted more slowly, are less likely to be listed as either first or last author on a paper, and are allocated less research funding from national research councils. Women also tend to be less involved in international collaboration; female collaborations are more domestically oriented than are the collaborations of males from the same country; and females have less-prestigious collaborations and fewer collaborations overall (see Holman & Morandin, 2019; Halevi, 2019; Larivière et al., 2013; Larivière et al., 2011; Aksnes, Rørstad, Piro, & Sivertsen, 2011; Aksnes, Piro, & Rørstad, 2019; Huang et al., 2020; Maddi, Larivière, & Gingras, 2019; Fell & König, 2016; van den Besselaar & Sandström, 2016; Nielsen, 2016). In every country studied recently in Elsevier (2020) and Elsevier (2018), the percentage of women who publish internationally is lower than the percentage of men who do so. For Poland, which is not included in the Elsevier reports, these publishing patterns are confirmed for various collaboration intensity levels and for various age groups (see Kwiek & Roszka, 2020, on gender disparities in international collaboration).

Female scientists in Poland constitute a substantial, highly productive, and highly internationalized part of the academic workforce, which is often the case in formerly communist European countries, which exhibit greater gender parity than the world and the OECD averages (Larivière et al., 2013, p. 212). Poland has a higher proportion of female professors than any country studied in Larivière et al. (2013) or in Diezmann and Grieshaber (2019), reaching 29.82% in 2018 (GUS, 2019, p. 220), even though there is a clear “the higher the fewer” pattern across all institutional types. As will be highlighted by the results of this paper, in Poland, males tend to collaborate with males, but females tend not to collaborate with females. Thus, gender homophily is high among Polish males and low among females, the latter constituting 41.5% of Polish university professors (of all ranks, all with at least a doctoral degree) in our sample and 46.10% of the entire full-time academic workforce in 2018 (GUS, 2019, p. 220).

A recent cohort analysis of the effects of gender on publication patterns in mathematics (Mihaljević-Brandt, Santamaria, & Tullney, 2016, pp. 8–13), one of the most heavily male-dominated academic fields, based on the scholarly output of 150,000 mathematicians, shows that women publish less at the beginning of their careers; they leave academia at a higher rate than men; and high-ranked mathematics journals publish fewer articles authored by women. Women may also suffer from “biased attention” to their work, even if their work is of comparable quality (Lerchenmueller et al., 2019, p. 10). The authors’ gender is also reported to affect the citations received (Potthoff & Zimmermann, 2017; Maddi et al., 2019). Female lead authors are reported to receive up to 29% fewer citations for work published in the most influential journals (as shown for publications from the PubMed database of 3,233 recipients of prestigious fellowships in life sciences in the U.S.: Lerchenmueller et al., 2019, p. 4).

Furthermore, gender-based homophily in citations exists in all disciplines, as a study of the citation data of seven million articles published in 2008–2016 shows: the citer disproportionately cites references from authors who are of the same gender, male scientists disproportionately citing other male scientists, possibly leading to a “perpetual disparity” in citations in favor of men as men represent about 70% of all authorships (Ghiasi, Mongeon, Sugimoto, & Larivière, 2018, p. 1520). Moreover, recent research based on a sample of CVs of U.S. economists reports that gender influences the attribution of credit for group work, that is, co-authorship matters differently for tenure for men and women, with women being less likely to receive tenure the more they co-author (Sarsons, Gërxhani, Reuben, & Schram, 2021). This differential attribution of credit contributes to the gender promotion gap (Fell & König, 2016; Abramo, D’Angelo, & Rosati, 2015). Furthermore, the gender citation gap persists: even though female scientists may publish more in journals with higher impact factors than their male peers, their work may receive lower recognition (fewer citations) from the scientific community (as Ghiasi, Larivière, & Sugimoto, 2015, have shown for female engineers, using a sample of 680,000 articles from 2008–2013, and Maliniak et al., 2013, for top journals in international relations).

2.2. Female scientists and competition

Of the various approaches to studying the “increasing and persistent” gender gap (Huang et al., 2020, p. 3) and “pervasive” gender hierarchies (Fox, 2020, p. 1001) in science, an approach centered on competition is especially relevant in the context of homophilous and heterophilous collaboration patterns. There have been ongoing discussions in experimental and personnel economics (often with laboratory-based evidence) about whether women are deterred by competition in some areas of science (and in some workplaces more generally; Flory et al., 2014; Dargnies, 2012). The systematic shying away from competition could have implications not only for the distribution of females across academic disciplines and their sub-disciplines but also for team formation in research collaboration, the selected prestige level of journals in academic publishing, and authorship composition. Laboratory experiments show that women may shy away from competition and men may embrace it, with gender implications for publishing in top academic journals, where competition is stiff and the risk of rejection high (Sonnert & Holton, 1996; Kwiek, 2021). Women are extremely underrepresented in top journals in some disciplines, such as mathematics (Mihaljević-Brandt et al., 2016, p. 19), and they can self-select into lower-ranked journals (Mayer & Rathmann, 2018). Gender differences in the propensity to choose competitive environments (in our case, highly selective journals) are reported to be driven by gender differences in confidence and preferences for entering and performing in a competition (Niederle & Vesterlund, 2007, pp. 1098–1100). In their study of all full professors in psychology in Germany, Mayer & Rathmann (2018, pp. 1674–1676) show that in top journal publications, there are considerably more men with a high publication output, as well as considerably fewer men with a low publication output. Gender differences in choices over competition may be driven partly by men preferring competitive to non-competitive settings and by a significantly stronger aversion to competitive workplaces among women compared to men (Flory et al., 2014). Not surprisingly, male scientists over-cite (King et al., 2017; Maliniak et al., 2013), are better represented in top journals, and have higher visibility in science (Maddi et al., 2019).

Academic norms or expectations of conventional behavior may also matter: there may be a common social practice, particularly in male-dominated disciplines of science, that “holds women up to more scrutiny than men” (Gupta, Poulsen, & Villeval, 2013, p. 16). Sonnert and Holton (1996, p. 69), in their study of gender disparities in career patterns of especially promising scientists, conclude that women might be seen as socialized to be less competitive “so that they choose their own niche rather than enter the fray with numerous competitors working on the same topic,” often feeling they are “under the magnifying glass.” Male scientists may be “more aggressive, combative and self-promoting in their pursuit of career success, and so they achieve higher visibility” (Sonnert & Holton, 1996, p. 67). Social norms may thus influence publishing patterns, including, for instance, predominantly same-sex publishing for male scientists, especially in more traditional societies such as Poland.

At the same time, in more firmly male-dominated disciplines (such as physics and astronomy, engineering, and computer science, in the Polish case), female scientists may feel more intense performance pressure due to their high visibility among the overwhelming majority of male scientists and carrying the burden of representing women in these disciplines. They may have to work “twice as hard to prove their competence,” with all their actions being public, as Kanter (1977, p. 973) suggested in her classic study of the role of male-female proportions in workplace settings. Being less competitively inclined in an increasingly competitive environment of global science may hurt female scientists, especially in their early careers, at an individual level of obtaining tenure, salary increases, and research funding (Van den Besselaar & Sandström, 2015; Sarsons et al., 2021; Kwiek, 2018a). In Polish academia, the list of disciplines where female participation approximates or exceeds 50% goes beyond the social sciences and humanities (to include also business, economics, and econometrics; agricultural and biological sciences; medicine; chemistry; biochemistry, genetics and molecular biology; and psychology; see Table 16 in Data Appendices). Out of the 24 ASJC Scopus disciplines studied in this paper, female representation reaches at least 50% in 10 of them.

2.3. Gender homophily in research collaboration defined

The literature investigating gender homophily in academic publishing is based on research on selected institutions (e.g., McDowell & Smith, 1992), selected disciplines (predominantly economics, as in Boschini & Sjögren, 2007, or McDowell, Singell, & Stater, 2006), and large-scale bibliometric data (see Wang et al., 2019, who examined 252,413 papers with 807,588 authorships from the JSTOR corpus, or Ghiasi et al., 2015, who studied approximately one million Web of Science authorships in engineering).

Most recent bibliometric studies on gender differences in research collaboration patterns suggest that men tend to co-author with men and women with women, leading to the research theme of “gender homophily” in science (Ghiasi et al., 2018; Potthoff & Zimmermann, 2017; Lerchenmueller et al., 2019; Kegen, 2013; Wang et al., 2019; Boschini & Sjögren, 2007). At the same time, however, collaboration in research, traditionally operationalized as co-authored publications, influences career progress. Excessive gender homophily among women, while supportive for early-career female researchers, may also harm their careers. This is especially relevant for particularly able female scientists publishing in high-impact journals (as Lerchenmueller et al., 2019, show with powerful empirical evidence). Women may place themselves at a disadvantage when collaborating disproportionately with other women because, for example, “women tend to be part of less resource-rich and influential networks or because women’s work may receive less attention than men’s, likely harming career progress” (Lerchenmueller et al., 2019, p. 3). This is not the case in Poland, though, as we shall demonstrate, since the Polish female scientists studied tend to avoid publishing exclusively with other female scientists at all levels of their careers and for all age groups.

As mentioned, the homophily principle maintains that “similarity breeds connection” and personal networks are homogeneous with regard to sociodemographic, behavioral, and intrapersonal characteristics. Homophily is known to “limit people’s social worlds” (McPherson, Smith-Lovin, & Cook, 2001, p. 415). According to this principle, contact between similar people occurs at a higher rate than among dissimilar people; in other words, “birds of a feather flock together” (McPherson et al., 2001, p. 417). Thus, males should co-author with males in a disproportionate fashion, while females should co-author disproportionately with females, across countries, disciplines, and institutions.

Homophily in general (including the gender-based homophily examined in this research) is reported to simplify communication, enhance the predictability of behavior, entail reciprocity in collaboration, and increase trust between collaborating parties (McPherson et al., 2001, p. 435; Kegen, 2013, p. 63). As Kegen (2013, p. 65) notes, while the behavior of collaborators might be more predictable and collaboration potentially less costly, gender homophily might also exclude women from informal networks. Furthermore, embeddedness in academic social networks, especially informal networks, is crucial both for doing research and for achieving a career. “Networks matter. Producing high-quality work is not sufficient for research to gain the attention of the widest number of scholars or have the greatest impact” (Maliniak et al., 2013, p. 918).

If homophily means “the tendency of people to choose to interact with similar others” (McPherson et al., 2001, p. 435), then gender-based homophily in this research means Polish male scientists disproportionately co-authoring with other male scientists, and Polish female scientists co-authoring disproportionately with other female scientists. Recent research tends to indicate that female scientists exhibit stronger gender homophily than male scientists (Jadidi et al., 2018): females are reported to collaborate more often with females than males with males (Kegen, 2013; Lerchenmueller et al., 2019; Ghiasi et al., 2018). Evidence from co-authorship patterns in economics indicates that team formation in academic publishing is not gender-neutral: rather, there is powerful gender sorting in team formation (Boschini & Sjögren, 2007). However, the practices of collaboration between males and females differ across disciplines (Maddi et al., 2019); the patterns of international research collaboration differ cross-nationally (see Kwiek, 2020a, on 28 European countries) and between genders intra-nationally (see Kwiek, 2020b, and Kwiek & Roszka, 2020, on Poland).

2.4. Hypotheses of this research

Following a comprehensive literature review and based on prior in-depth knowledge of the Polish academic science system, we have formulated the following seven research questions leading to seven hypotheses (which are presented in Table 1, along with the results of our research).

The Polish science and higher education systems have been studied intensively. For instance, Kulczycki and colleagues examined the funding system (Kulczycki, Korzeń, & Korytkowski, 2017), and Bieliński and Tomczyńska (2018) studied the various manifestations of the ethos of science and showed how Poland is moving away from Michael Polanyi’s “republic of science”. Feldy and Kowalczyk (2020) studied how scientists view the system of financing science, and Kulczycki and Korytkowski (2020) examined changing publication patterns in Poland. Furthermore, higher education reforms (e.g., Shaw, 2019; Antonowicz, Kulczycki, & Budzanowska, 2020; Kwiek, 2012), international research collaboration (Kwiek, 2020b), and high research productivity (Kwiek, 2018b) have been examined. Gender disparity in Polish science, however, has rarely been studied, and gender collaboration patterns, including gender homophily, have not been examined except by Kwiek and Roszka (2020), who studied international research collaboration by gender and showed that male scientists dominate in this collaboration type at each level of intensity, with significant cross-disciplinary differences (Nielsen, 2016, came to similar conclusions in his study of a Danish university). Siemienska (2007) examined the gender research productivity gap with reference to the cultural capital of faculty members. Finally, Kosmulski (2015) analyzed the productivity and impact of male and female scientists in the period 1975–2014, based on a limited set of authors bearing one of the 26 most popular “–ski” or “–cki” names, showing that male scientists generally have higher productivity and impact than female scientists, except in biochemistry, where their productivity and impact are almost equal.

3. Data and methods

3.1. Dataset

Two large databases of different natures were merged: Database I was an official national administrative and biographical register of all Polish academic scientists; Database II was the Scopus database. The two were merged to create “The Polish Science Observatory,” which was maintained and periodically updated by the two authors (a short description of the database is presented in Kwiek and Roszka, 2020). The main steps in merging the biographical and administrative dataset (The Polish Science) with the publication and citation database (Scopus) are graphically shown in Figure 1.

Table 1. Research hypotheses and results (summary).

| Research Question | Hypothesis | Result |
|---|---|---|
| RQ1. What is the relationship between gender and same-sex collaboration? | Hypothesis 1. We would expect that the same-sex collaboration ratio is higher for female than for male scientists. | Not confirmed |
| RQ2. What is the relationship between gender, same-sex collaboration, and age? | Hypothesis 2. We would expect that the same-sex collaboration ratio decreases with age for both male and female scientists. | Confirmed for male scientists only |
| RQ3. What is the relationship between gender, same-sex collaboration, and academic position? | Hypothesis 3. We would anticipate that the same-sex collaboration ratio decreases with academic position for both male and female scientists. | Confirmed for male scientists only |
| RQ4. What is the relationship between gender, same-sex collaboration, and academic disciplines? | Hypothesis 4. We would anticipate that the same-sex collaboration ratio is higher in male-dominated academic disciplines. | Confirmed |
| RQ5. What is the relationship between gender, same-sex collaboration, and institutional research intensity? | Hypothesis 5. We would expect that the same-sex collaboration ratio is higher in research-intensive universities. | Confirmed for male scientists only |
| RQ6. What is the relationship between gender, gender-defined research collaboration type, and journal prestige? | Hypothesis 6. We would expect that the journal prestige level of mixed-sex publications is higher than that of same-sex publications for both male and female scientists. | Confirmed |
| RQ7. What is the impact of gender and other individual-level and institutional-level independent variables on gender homophily in research collaboration? | Hypothesis 7. In a fractional logit regression model, we would anticipate that individual-level independent variables are more influential than institutional-level independent variables in predicting the same-sex collaboration ratio. | Not confirmed |

Table 2. An example of probabilistic integration output (identical, similar, and disparate pairs of strings).

| Last name, Database II | First name, Database II | Last name, Database I | First name, Database I | Last name compliance | First name compliance | Posterior probability |
|---|---|---|---|---|---|---|
| Kwiek | Marek | Kwiek | Marek | 2 | 2 | 0.9975556 |
| Mrowiec | Bozena | Mrowiec | Bożena | 2 | 1 | 0.9946168 |
| Sobkow | Agata | Sobków | Agata | 1 | 2 | 0.9991700 |
| Wltek | Bozena | Witek | Bożena | 1 | 1 | 0.9073788 |
| Mudry | Z. | Mudryk | Zbigniew | 2 | 0 | 0.8846165 |

Database I (created by the OPI National Research Institute) comprised 99,535 scientists employed in the Polish science sector as of November 21, 2017. Only scientists with at least a doctoral degree (70,272) and employed in the higher education sector were selected for further analysis (54,448 or 54.70% of all scientists, all working at 85 universities of various types). The data used were both demographic (gender and date of birth) and professional (the highest degree awarded; award date of Ph.D., habilitation, and full professorship; and institutional affiliation), with each scientist identified by a unique ID. Database II included 169,775 names from 85 institutions whose publications for the decade analyzed (2009–2018) were included in the database and 384,736 Scopus-indexed publications. Authors in Database II were defined by their institutional affiliations, Scopus documents, and individual Scopus IDs. Scopus uses a sophisticated author-matching algorithm to precisely identify publications by the same author; gender is not captured in Scopus Author Profiles (Elsevier, 2020, p. 119). We did not reconstruct the full publishing careers of Polish scientists (as in Huang et al., 2020) but only their output for the last decade, when their Scopus publications increased markedly.

We identified authors with their different individual IDs in the two databases and provided them with a new ID in the new “Observatory” database. Probabilistic methods of data integration were used (Fellegi & Sunter, 1969; Herzog, Scheuren, & Winkler, 2007; Enamorado, Fifield, & Imai, 2019). Separately within each of the 85 universities, the first name and last name of each record in Database I were compared with those of each record in Database II using the Jaro-Winkler string distance (with values from 0 to 1; see Jaro, 1989; Winkler, 1990). Pairs of strings with a distance greater than 0.94 were considered identical (signified by 2) (see Table 2). Pairs with a distance greater than 0.88 but less than 0.94 were considered similar (signified by 1), while those with a distance less than 0.88 were considered disparate (signified by 0). Next, using an expectation maximization algorithm (Enamorado et al., 2019), the posterior probability that a given pair of records belongs to the same unit was estimated. If the probability was greater than 0.85, the pair was considered to be part of the same unit (as suggested by Harron et al., 2017). The computation was made using the fastLink R package (version 0.6.0).
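The string-comparison step can be illustrated with a minimal Python sketch. It is not the fastLink implementation used in the study: the functions `jaro`, `jaro_winkler`, and `agreement_level` are hypothetical names introduced here, the score is treated as a similarity (1 means identical strings), the standard Winkler prefix weight of 0.1 is assumed, and the handling of scores exactly equal to a threshold is an assumption, since the text does not specify it. The expectation maximization step that produces posterior probabilities is not reproduced.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_weight: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_weight * (1 - base)


def agreement_level(s1: str, s2: str,
                    identical: float = 0.94, similar: float = 0.88) -> int:
    """Code a pair of strings as 2 (identical), 1 (similar), or 0 (disparate)."""
    score = jaro_winkler(s1, s2)
    if score > identical:
        return 2
    if score > similar:   # assumption: boundary values fall to the lower level
        return 1
    return 0


# Name pairs taken from Table 2 (non-ASCII letters written as escapes).
print(agreement_level("Kwiek", "Kwiek"))         # identical strings
print(agreement_level("Sobkow", "Sobk\u00f3w"))  # one diacritic differs
print(agreement_level("Z.", "Zbigniew"))         # initial versus full name
```

Under these assumptions the three pairs are coded 2, 1, and 0, which agrees with the compliance codes reported for them in Table 2.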

By employing a probabilistic approach to the merging of the datasets, it was possible to estimate the uncertainty of the process and thus assess the quality of the new integrated database by calculating the percentage of records incorrectly classified as matches (false discovery rate, FDR) and as non-matches (false negative rate, FNR). Deduplication procedures were applied to the raw integrated author database as 38,750 records referred to 32,937 unique authors. For duplicated records, a clerical review was performed (Herzog et al., 2007). Manual verification of duplicate records revealed that 1,207 records (12.15% of duplicated records and 3.11% of all integrated records) were incorrectly assigned to the same person. These records were deleted from the integrated database. The integrated database used in our research finally included 32,937 unique authors of publications, including 25,463 authors of journal articles.

Figure 1. Database merging: Main steps in merging the biographical and administrative dataset (The Polish Science) with the publication and citation database (Scopus).

Finally, Database II also contained metadata on 384,736 publications published in 2009–2018. Of these, 377,886 papers had up to 100 authors, and 230,007 were written by the authors included in Database I (we used deterministic record linkage at this stage of data integration). Subsequently, only articles published in journals were selected for further analysis, which reduced the number of papers in the database to 158,743 articles.

3.1.1. Limitations

Our research has some limitations and possible biases (e.g., selection bias) as a result of the database construction procedures we employ: we select only internationally visible authors, that is, authors with Scopus-indexed publications. The selection of a different database (for instance, Web of Science or the Polish Scientific Bibliography, PBN), a different period (other than 2009–2018), a different publication format (other than articles in journals), or a different language (other than English) might lead to different results.

The date of reference for the data derived from Database I (“The Polish Science”) was November 21, 2017, and for the data derived from the Scopus database was the whole decade studied. There are five simplifying assumptions. (1) The paper examines a decade of individual publishing output. While the actual publishing period may, in fact, be shorter than a decade for younger scientists, it may be only the most recent decade of the long-term publishing activities of older scientists. (2) Journal percentile ranks as provided by Scopus are deemed stable within this decade, even though they may fluctuate over the period studied. (3) We assume that Polish scientists were not changing institutions (between 75 research-involved and 10 research-intensive institutions) in the decade studied, as the mobility within the Polish higher education system is very low. (4) We regard Polish scientists who were assistant, associate, and full professors on the date of reference as keeping these positions for the whole decade studied, while these positions are the highest ranks achieved in the study period. (5) We use an internationally recognizable tripartite division of academic positions into assistant, associate, and full professors, even though in fact we use two Polish academic degrees (doctorates and habilitations) and a Polish academic title (professorship). In this sense, our “academic positions” are proxies for Polish “academic degrees and titles”. However, all scientists in our sample have their doctoral degrees (and therefore must be at least assistant professors). All scientists with habilitation degrees receive the position of associate professor within three years, and all full professors have their professorship titles.

While biological age, academic position, and employment type and institution were defined as of November 21, 2017, the variables derived from the Scopus database were constructed to show mean values for the decade of 2009–2018, in which they may have differed from year to year. A limitation is that the values for 2017 for some variables and the mean values for the decade of 2009–2018 are lumped together. Clearly, even the binary classification of male-dominated disciplines valid for 2017 may have been different in the previous years, especially for disciplines close to the threshold value of 50% (for instance, HUM, SOC, and ECON, with 49.8%, 49.8%, and 49.1% of female scientists, respectively; see Table 16 in Data Appendices).

This means that longitudinal studies (year by year) and cohort studies (by consecutive cohorts of scientists) were not possible because of data limitations. Instead, the dominant disciplines ascribed to scientists, individual publication portfolios, gender composition of disciplines, and average publication prestige were constructed for the decade of 2009–2018. For instance, a single observation was a male who in 2017 was 60 years old and was employed full-time as an associate professor in a research-intensive university. He was publishing in ASJC Physics and Astronomy, and his individual Scopus-indexed research output for the decade studied (2009–2018) was strictly defined in terms of publication numbers (20 Scopus-indexed journal articles), the gender composition of his co-authorships (60% all-male, 35% mixed-sex, 5% solo), and the individual average publication prestige expressed in percentile ranks (Scopus 85th percentile rank). While acknowledging this limitation, we must stress that Polish scientists publish far too little per year to use Scopus data from a single year (e.g., 2017 only). We assume that a decade of publishing provides a good overview of individual publishing patterns captured within individual publication portfolios.

3.2. Methods

As in our previous work on gender disparities in international research collaboration (Kwiek and Roszka, 2020), every Polish scientist represented in our integrated database was ascribed to one of 27 ASJC disciplines at the two-digit level (following Abramo, Aksnes, & D’Angelo, 2020, who determined the dominating Web of Science subject category for each scientist they studied). A given paper can have one or multiple disciplinary classifications (see the ASJC discipline codes used, as described in Table 3, which presents the variables used in this analysis). The dominant discipline for each scientist is the mode, that is, the most frequently occurring value (when no single mode occurred, the dominant discipline was randomly selected). All Polish scientists were defined by their gender, discipline, and publications (solo, all-male, all-female, or mixed-sex). Every ASJC discipline has its own proportions of male and female scientists. However, the GEN, NEURO, and NURS disciplines did not meet an arbitrary minimum threshold of 50 scientists per discipline and were omitted from further analysis.
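The mode-based assignment of a dominant discipline can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the study: the function name `dominant_discipline` is hypothetical, the discipline labels in the example are illustrative, each discipline code is assumed to count once per paper, and ties are assumed to be broken by a uniform random draw among the tied codes.

```python
import random
from collections import Counter


def dominant_discipline(paper_disciplines, rng=None):
    """Return the most frequent discipline code across a scientist's papers.

    paper_disciplines: one list of discipline codes per paper
    (a paper may carry several codes).
    rng: optional random.Random instance, used only to break ties.
    """
    rng = rng or random.Random()
    counts = Counter(code for codes in paper_disciplines for code in codes)
    if not counts:
        return None  # no classified papers
    top = max(counts.values())
    tied = sorted(code for code, n in counts.items() if n == top)
    # A single mode is returned directly; otherwise pick one tied code at random.
    return tied[0] if len(tied) == 1 else rng.choice(tied)


# Three papers, one of them with two classifications (labels are illustrative).
papers = [["PHYS"], ["PHYS", "MATH"], ["CHEM"]]
print(dominant_discipline(papers))
```

Passing a seeded `random.Random` makes the tie-breaking reproducible across runs, which matters when the assignment feeds later analyses.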

In the present research, in which the unit of analysis was an individual scientist, every scientist had solo or collaborative articles. Collaborative articles include same-sex and mixed-sex articles. Collaborative articles with authors included in our database are defined in terms of the gender of the authors. Of the Polish scientists included in the integrated database of 54,448 scientists, 100% had their gender defined in the original administrative database. In contrast, there are Polish co-authors outside of our database (e.g., affiliated with the Polish Academy of Sciences) and international co-authors of publications with Polish co-authors whose gender is not defined.

Regarding international collaborators of Polish authors and their gender, we analyzed 158,743 articles with individual EIDs (Scopus individual publication IDs). There were 15,149 articles (9.54%) written solely by female scientists, 39,089 (24.62%) written solely by male scientists, 78,419 (49.40%) written in mixed female-male collaboration, and 18,109 (11.41%) solo-written articles. There were 7,979 articles (5.03%) for which only the gender of Polish co-authors was known.
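The article categories listed above can be illustrated with a small classifier. This is a sketch only: the labels, the `"M"`/`"F"`/`None` encoding, and the treatment of single-author papers with unknown gender are assumptions made for illustration.

```python
def collaboration_type(author_genders):
    """Classify one article from the genders of its authors.

    author_genders: list with one entry per author, each "M", "F",
    or None when the gender could not be determined (assumed encoding).
    """
    if len(author_genders) == 1:
        return "solo"
    if any(g is None for g in author_genders):
        # Corresponds loosely to articles for which only some
        # co-authors' gender is known.
        return "partly-unknown"
    kinds = set(author_genders)
    if kinds == {"M"}:
        return "all-male"
    if kinds == {"F"}:
        return "all-female"
    return "mixed-sex"


print(collaboration_type(["M", "M", "M"]))   # all-male
print(collaboration_type(["F", "M"]))        # mixed-sex
print(collaboration_type(["F", None]))       # partly-unknown
```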

For the purpose of determining the gender of the international co-authors, we used another dataset at our disposal: a dataset of 27.4 million publications published in the same period of 2009–2018 in the OECD area and indexed in Scopus. Our “OECD” dataset includes all metadata about all publications produced in the study period in 1,674 research-active institutions located in 40 OECD economies (the threshold we used was 3,000 Scopus-indexed articles published in the past 10 years). Specifically, we used a subset of our OECD dataset of authors (with 11,087,392 individual Scopus IDs). In the next step, we used the R package genderizeR to estimate the gender of the OECD authors from our OECD dataset (see Wais, 2016, on the various gender determination methods, including via the R package).

GenderizeR has previously been used for gender prediction in several studies: Topaz and Sen (2016) examined gender representation in editorial boards in 435 journals in mathematical sciences; Fell and König (2016) studied gender difference in co-authorships among 4,234 industrial-organizational psychologists; and Huang et al. (2020) examined gender inequality in the academic careers of 7.9 million Web of Science authors. Finally, Wang et al. (2019) also used the R package to study gender-based homophily in JSTOR publication data. Genderize.io provides a count of the number of times a given first name appears in its corpus and the corresponding probability of the predicted gender (which is either male or female). In order to establish optimal values of gender prediction indicators, we can manipulate the thresholds of the probability and count values.
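Thresholding on probability and count can be sketched as follows. The record layout (a dict with `gender`, `probability`, and `count` keys) is an assumption that mirrors the fields described above; the default probability cut-off of 0.85 is the value reported in this section, while the `min_count` default is purely hypothetical.

```python
def accept_gender(prediction, min_probability=0.85, min_count=1):
    """Return the predicted gender if it passes both thresholds, else None.

    prediction: dict with "gender" ("male", "female", or None),
    "probability" (0..1), and "count" (name occurrences in the corpus).
    """
    gender = prediction.get("gender")
    if gender not in ("male", "female"):
        return None  # the service could not classify the first name
    if prediction["probability"] < min_probability:
        return None  # prediction too uncertain
    if prediction["count"] < min_count:
        return None  # name too rare to trust the probability
    return gender


record = {"gender": "female", "probability": 0.97, "count": 1200}
print(accept_gender(record))  # female
```

Raising either threshold trades coverage (fewer authors with an accepted gender) for accuracy (fewer misclassified names).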

Using the R package, the gender of 7,640,123 of our OECD authors (individual Scopus IDs) was estimated with a probability of greater than or equal to 0.85. With the data at our disposal, out of 11,087,392 authors, the genderizeR algorithm was unable to estimate the gender of 2,521,150 authors (22.74%), including a large number of authors from Japan and South Korea, with whom Poland collaborates only marginally. Out of 8,566,242 authors whose gender the algorithm estimated, in 926,119 (10.81%) of cases, gender was estimated with a probability lower than 0.85. In the next step, using individual Scopus IDs, “The Polish Science Observatory” and the “OECD” datasets were merged to determine the gender of international collaborators of Polish authors. Out of 164,908 international collaborators, we were able to determine the gender of 83,702 (or 50.75%). Our reference database to estimate


Table 3

Variables used in the analysis.

| No. | Variable | Description | Source |
|---|---|---|---|
| 1. | Biological age | Numerical variable. Biological age as provided by the national registry of scientists (N = 99,935). Age in full years as of 2017 is used. | Observatory |
| 2. | Age group | Categorical variable. Three major age groups are used: young (39 and younger; N = 8,400), middle-aged (40–54; N = 11,014), and older (55 and older; N = 6,049) scientists. | Observatory |
| 3. | Gender | Binary variable, as provided by the national registry of scientists (N = 99,935). No other options are possible in the registry. | Observatory |
| 4. | Academic position | Categorical variable. Three Polish degrees used as proxies of academic positions: doctoral degree only (assistant professor; N = 14,271); habilitation degree (associate professor; N = 7,418); and professorship title (full professor; N = 3,774). All scientists without doctoral degrees and from outside of the higher education sector were removed from the analysis. | Observatory |
| 5. | Discipline | Categorical variable. All scientists ascribed to one of 27 Scopus ASJC disciplines. Dominant disciplines were used (N = 25,463). | Scopus |
| 6. | STEM disciplines | Categorical variable. STEM disciplines: AGRI agricultural and biological sciences; BIO biochemistry, genetics, and molecular biology; CHEMENG chemical engineering; CHEM chemistry; COMP computer science; DEC decision science; EARTH earth and planetary sciences; ENER energy; ENG engineering; ENVIR environmental science; GEN biochemistry, genetics, and molecular biology; IMMU immunology and microbiology; MATER materials science; MATH mathematics; NEURO neuroscience; NURS nursing; PHARM pharmacology, toxicology, and pharmaceutics; and PHYS physics and astronomy. | Scopus |
| 7. | Non-STEM disciplines | Categorical variable. Non-STEM disciplines: BUS business, management, and accounting; DENT dentistry; ECON economics, econometrics, and finance; HEALTH health professions; HUM arts and humanities; MED medicine; PSYCH psychology; SOC social sciences; and VET veterinary. | Scopus |
| 8. | Male- and female-dominated disciplines | Binary variable. Male-dominated disciplines are those in which the percentage of male scientists exceeds or equals 50% (N = 12,786 scientists). Female-dominated disciplines are those in which the percentage of female scientists exceeds 50% (N = 12,677 scientists). | Observatory |
| 9. | Mean publication prestige (percentile rank) | Categorical variable. Mean prestige represents the median prestige value for all publications written by a scientist in the study period of 2009–2018. For journals for which the Scopus database did not ascribe a percentile rank, we have ascribed the percentile rank of 0; Scopus ascribes percentiles to journals in the 25th to 99th percentile range, with the highest rank being the 99th percentile. | Scopus |
| 10. | Same-sex collaboration ratio | Categorical variable. The percentage of same-sex collaboration articles (male-male, female-female) among all collaborative articles in an individual publication portfolio defined for 2009–2018. | Observatory |
| 11. | Mixed-sex collaboration ratio | Categorical variable. The percentage of mixed-sex collaboration articles (male-female) among all collaborative articles in an individual publication portfolio defined for 2009–2018. | Observatory |
| 12. | Solo collaboration ratio | Categorical variable. The percentage of single-authored articles among all articles in an individual publication portfolio defined for 2009–2018. | Observatory |
| 13. | Research-intensive institution | Binary variable. The 10 institutions (from among 85 examined) are the IDUB (or “Excellence Initiative–Research University”) institutions selected in 2019. | Ministry |
| 14. | Number of employees | Numerical variable. The number of full-time employed scientists (in FTEs: full-time equivalents) as of December 31, 2018. | Ministry |

the gender of co-authors was restricted to 1,674 research-intensive OECD universities; consequently, we were not able to estimate the gender of collaborators from non-research-intensive universities in the OECD area or from non-OECD universities.

Next, using an individual scientist as the unit of analysis, we calculated the proportion of same-sex publications among collaborative articles within the individual publication portfolio of every Polish scientist in the sample. Thus, for all scientists, male and female, within their collaborative articles only, we determined what we termed the same-sex collaboration ratio (for male scientists collaborating only with male scientists, the ratio is 1). Analogously, the ratio of 0 is equivalent to conducting no same-sex collaboration: the scientist collaborates only with the other gender, i.e., there are only mixed-sex publications in the scientist’s individual publication portfolio. The ratio does not take into account the different availability of male and female colleagues within each discipline. The availability, or the gender composition of each discipline, includes both their numbers and their percentages. As the sample section (3.3) shows in detail (see Table 16 in Data Appendices), there are more than 1,000 female scientists in only three disciplines (AGRI, BIO, and MED) and more than 500 in six disciplines (the three above and CHEM, ENG, and ENVIR), whereas there are more than 1,000 male scientists in three disciplines, and more than 500 in 12 disciplines.
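A minimal sketch of the same-sex collaboration ratio follows. The function name and the per-article labels are hypothetical; the worked example reuses the illustrative physicist from the dataset description (20 articles: 60% all-male, 35% mixed-sex, 5% solo), for whom the ratio is 12/19 because the solo article is excluded from the denominator.

```python
def same_sex_ratio(article_types):
    """Share of same-sex articles among a scientist's collaborative articles.

    article_types: one label per article in the portfolio, each
    "same-sex", "mixed-sex", or "solo" (assumed labels).
    Returns None when the portfolio has no collaborative articles.
    """
    collaborative = [t for t in article_types if t in ("same-sex", "mixed-sex")]
    if not collaborative:
        return None
    same_sex = sum(1 for t in collaborative if t == "same-sex")
    return same_sex / len(collaborative)


# 12 all-male, 7 mixed-sex, and 1 solo article for a male scientist.
portfolio = ["same-sex"] * 12 + ["mixed-sex"] * 7 + ["solo"]
print(round(same_sex_ratio(portfolio), 3))  # 0.632
```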

The gender composition of the 24 disciplines studied would be a serious limitation of the independent variable as long as we assumed that scientists collaborate only within their disciplines. However, scientists in our dataset collaborate both within disciplines and across them, which can be seen from the disciplinary statistics pertaining to individual publication portfolios by ASJC discipline and to authorship combinations by ASJC discipline for individual papers. Traditionally, especially in academic profession surveys (see Kwiek, 2019), scientists who are strongly embedded in their disciplines are identified, for instance, by checking the discipline


Table 4

The median of the same-sex collaboration ratio by gender.

| | Same-sex collaboration |
|---|---|
| Male | 0.500 |
| Female | 0.153 |
| Total | 0.333 |
| Z | -44.291 |
| p-value | < 0.001 |

of their doctoral dissertations; based on our dataset, in contrast, we examine scientists and their collaborative publications to which disciplines are ascribed via a Scopus indexing system. The availability of male or female colleagues within a discipline seems to matter much less in current settings, given the increasing large-scale cross-disciplinary collaboration (as evidenced by differing dominant ASJC disciplines ascribed to collaborating scientists). Table 3 provides a short description of variables (“Observatory” means The Polish Science Observatory database).

3.3. Sample

The characteristics of the sample (N = 25,463; 14,886 males and 10,577 females, 58.5% and 41.5%) are presented in Table 16 in Data Appendices: about half of the scientists are middle-aged (in the 40–54 age bracket; 49.7%), and over half of them are assistant professors (56.0%). Column percentages enable the analysis of the gender distribution by major age groups, academic positions, and disciplines (by type: STEM and non-STEM, female-dominated and male-dominated). Row percentages enable the analysis of how male and female scientists are distributed according to a given feature. About half of the scientists work in female-dominated disciplines and about half in male-dominated disciplines (49.8% and 50.2%); however, females in female-dominated disciplines are a weaker majority (54.6%) than males in male-dominated disciplines (71.4%). All assistant professors hold doctoral degrees, all associate professors hold habilitations, and all full professors hold professorship titles.

The 25,463 scientists in our integrated database had at least a single article in the Scopus database in the period 2009–2018; the database therefore includes all internationally visible Polish academic scientists (on the skewed distribution of research productivity of Polish scientists, see Kwiek, 2018b; on the upper 10%, termed top performers, who produce about half of all publications across 11 European systems, see Kwiek, 2016). Additionally, our sample includes the international collaborators of Polish authors, whose gender was determined using the algorithm described in the Data and Methods subsection (164,908 international co-authors). The differentiated proportions of female scientists can also be examined by academic discipline. While female scientists are especially underrepresented in the four disciplines of computer science (COMP 16.5%), engineering (ENG 14.9%), physics and astronomy (PHYS 16.6%), and mathematics (MATH 25.2%), the number of male and female scientists is almost equal in arts and humanities (HUM) and social sciences (SOC).

4. Results

4.1. The same-sex collaboration ratio by gender

Hypothesis 1. We would expect that the same-sex collaboration ratio is higher for female than for male scientists (not confirmed).

Gender homophily in publishing, or the same-sex collaboration ratio, falls within the range of 0 (no same-sex collaborative articles among collaborative articles in the individual publication portfolio) to 1 (exclusively same-sex collaborative articles among collaborative articles in the portfolio). The median ratio for males is more than three times that for females (0.500 compared with 0.153, Table 4). For the whole national sample, the median ratio is 0.333, meaning that for at least half of the authors, same-sex collaboration (males with males, females with females) accounts for at least 33.3% of their collaborative articles. The Mann-Whitney test (Z statistic) shows the gender difference to be statistically significant at the 0.05 level. Thus, Hypothesis 1 is not confirmed.
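For readers unfamiliar with the test, the Z statistic of a Mann-Whitney comparison can be sketched in pure Python using the normal approximation with a tie correction. This is an illustration of the standard textbook formula, not the authors' code; the function names are hypothetical, and the sign of Z depends on which group is passed first.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]


def mann_whitney_z(x, y):
    """Z statistic for samples x and y (normal approximation, tie-corrected).

    Negative when values in x tend to be smaller than values in y.
    Undefined (division by zero) if all pooled values are identical.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    ties = {}
    for v in pooled:
        ties[v] = ties.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in ties.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u1 - n1 * n2 / 2) / math.sqrt(variance)


# Toy ratios: the first group sits entirely below the second one.
print(mann_whitney_z([0.1, 0.2, 0.3], [0.4, 0.5, 0.6]))
```

With same-sex ratios, ties are common because many portfolios sit exactly at 0 or 1, which is why the tie correction is included.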

4.2. The same-sex collaboration ratio by age and academic position

Hypothesis 2. We would expect that the same-sex collaboration ratio decreases with age for both male and female scientists (confirmed for males but not for females).

Hypothesis 3. We would anticipate that the same-sex collaboration ratio decreases with academic position for both male and female scientists (confirmed for males but not for females).

Before analyzing the effect of age and academic position, we examined the level of correlation between these two variables since, in many academic systems, seniority is a significant predictor of career advancement. The boxplots in Figure 2 divide the data into


Figure 2. Age distribution in terms of academic position and ASJC.

Table 5

Kruskal-Wallis test statistics.

| ASJC | Kruskal-Wallis H | df | Asymp. Sig. | N | ASJC | Kruskal-Wallis H | df | Asymp. Sig. | N |
|---|---|---|---|---|---|---|---|---|---|
| AGRI | 1323.3 | 2 | < 0.001 | 2,702 | ENVIR | 668.8 | 2 | < 0.001 | 1,680 |
| BIO | 789.9 | 2 | < 0.001 | 1,780 | HEALTH | 21.2 | 2 | < 0.001 | 67 |
| BUS | 252.7 | 2 | < 0.001 | 714 | HUM | 540.9 | 2 | < 0.001 | 1,058 |
| CHEM | 744.3 | 2 | < 0.001 | 1,475 | IMMU | 64.3 | 2 | < 0.001 | 119 |
| CHEMENG | 215.2 | 2 | < 0.001 | 481 | MATER | 691.8 | 2 | < 0.001 | 1,462 |
| COMP | 370.6 | 2 | < 0.001 | 1,030 | MATH | 580.5 | 2 | < 0.001 | 1,026 |
| DEC | 31.2 | 2 | < 0.001 | 54 | MED | 1498.1 | 2 | < 0.001 | 3,574 |
| DENT | 33.4 | 2 | < 0.001 | 75 | PHARM | 105.8 | 2 | < 0.001 | 254 |
| EARTH | 540.6 | 2 | < 0.001 | 1,154 | PHYS | 530.1 | 2 | < 0.001 | 1,098 |
| ECON | 131.6 | 2 | < 0.001 | 379 | PSYCH | 136.8 | 2 | < 0.001 | 304 |
| ENER | 136.2 | 2 | < 0.001 | 295 | SOC | 424.9 | 2 | < 0.001 | 992 |
| ENG | 1440.9 | 2 | < 0.001 | 3,358 | VET | 158.2 | 2 | < 0.001 | 332 |

quartiles and show the median, which is higher for each subsequent academic position. The boxes enclose the middle 50% of the data (for instance, across all disciplines, half of full professors are aged about 60). Outliers are located predominantly above the boxes, showing the presence of older scientists within the three academic positions rather than younger ones. There is a clear interdependence between age and academic position as the average level of age increases with the three consecutive academic positions adopted in this paper (assistant professor, associate professor, and full professor) across all 24 ASJC disciplines. Also, the observed average age for each of the three stages of an academic career is similar among all the disciplines. This empirical observation is confirmed by the formal Kruskal-Wallis test in which we tested the null hypothesis that the average age is the same at every stage of the academic career: for each discipline, we reject the null hypothesis at a significance level of 0.001 (Table 5). However, both variables emerge as important in previous literature and therefore their joint impact will be studied below.
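The Kruskal-Wallis H statistic used for this check can be sketched in pure Python. This is the standard rank-based formula with a tie correction, shown for illustration only; the function names are hypothetical, and the toy ages are invented. With three academic positions the test has 3 − 1 = 2 degrees of freedom, which matches the df column in Table 5.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]


def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of samples, tie-corrected."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = {}
    for v in pooled:
        ties[v] = ties.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction, len(groups) - 1


# Invented ages for assistant, associate, and full professors.
ages = [[31, 34, 38], [45, 47, 52], [58, 61, 66]]
print(kruskal_wallis_h(ages))
```

H is then compared with a chi-square distribution with the returned degrees of freedom to obtain the significance level.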

For the purposes of examining the same-sex collaboration ratio by age group, we divided our sample into these three categories: young scientists (aged 39 and younger), middle-aged scientists (aged 40–54), and older scientists (aged 55 and older), of which middle-aged scientists are the largest age group (45.79%) (Table 6). The proportion of males and females is almost equal among young scientists, but females are less than 30% of older scientists (see % column).
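The age grouping and the column percentages can be reproduced with a short sketch. The function names are hypothetical; the counts in the usage lines are those reported in Table 6.

```python
def age_group(age):
    """Map age in full years to the three groups used in the paper."""
    if age <= 39:
        return "young"
    if age <= 54:
        return "middle-aged"
    return "older"


def share(count, total):
    """Percentage share rounded to one decimal place."""
    return round(100 * count / total, 1)


print(age_group(60))          # older
print(share(11660, 25463))    # middle-aged scientists in the whole sample
print(share(1865, 6478))      # females among older scientists
```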

Table 7 shows the distribution of the median value of the same-sex collaboration ratio by gender and age group. The median ratio for males slightly decreases with age (by 6 percentage points). In contrast, the same median ratio for females substantially increases with age (by 18 percentage points). While the ratio for females triples with age, it is still very low compared with that of males (the difference being 35 percentage points).


Table 6

Distribution of the sample of Polish scientists by age group and gender.

| | | Young (39 and younger) | Middle-aged (40–54) | Older (55 and older) | Total |
|---|---|---|---|---|---|
| Male | n | 3,747 | 6,526 | 4,613 | 14,886 |
| | % column | 51.2 | 56.0 | 71.2 | 58.5 |
| | % row | 25.2 | 43.8 | 31.0 | 100.0 |
| Female | n | 3,578 | 5,134 | 1,865 | 10,577 |
| | % column | 48.8 | 44.0 | 28.8 | 41.5 |
| | % row | 33.8 | 48.5 | 17.6 | 100.0 |
| Total | n | 7,325 | 11,660 | 6,478 | 25,463 |
| | % column | 100.0 | 100.0 | 100.0 | 100.0 |
| | % row | 28.8 | 45.8 | 25.4 | 100.0 |

Table 7

The median same-sex collaboration ratio by age group and gender.

| | Male | Female | Total | Z | p-value |
|---|---|---|---|---|---|
| Young (39 and younger) | 0.5396 | 0.0625 | 0.2727 | -29.676 | < 0.001 |
| Middle-aged (40–54) | 0.5000 | 0.1818 | 0.3333 | -28.163 | < 0.001 |
| Older (55 and older) | 0.4762 | 0.2353 | 0.3750 | -15.696 | < 0.001 |
| Total | 0.5000 | 0.1538 | 0.3333 | -44.291 | < 0.001 |

Figure 3. The distribution of the same-sex collaboration ratio by gender. The gray area is the overall distribution for both genders.

The difference in collaboration patterns for young scientists by gender is interesting in view of previous literature about gender patterns of research collaboration. This strand of literature suggests that women tend to co-author with women (Ghiasi et al., 2018; Potthoff & Zimmermann, 2017; Wang et al., 2019; Lerchenmueller et al., 2019), although this is not true in the Polish case. While half of young male scientists write at least 54% of their papers in collaboration with males, the same indicator for females is nine times lower (6.3%). Young males tend to collaborate with males, and young females tend not to collaborate with females. While 50% of young female scientists are characterized by the same-sex collaboration ratio at the level of 0.06, in the case of older females, the median ratio quadruples to 0.24: older females still tend to collaborate primarily with males. For all age groups (see the Total line in Table 7), the difference by gender in Polish science is substantial: while the median same-sex collaboration ratio for males is 0.5, the median for females is more than three times lower (0.15). (These results will be confirmed in a fractional logit regression analysis in Section 4.6.)

What is clear in the two panels in Figure 3 is the predominance of extreme values (0 for no same-sex collaboration and 1 for exclusively same-sex collaboration) in individual publication portfolios. The total number of extreme values (0 and 1) is similar for both genders. The majority of collaborations are mixed-sex collaborations. A substantial proportion of collaborating male scientists (left panel, right peak) co-published mostly with males in the decade studied; a substantial proportion of collaborating female scientists (right panel, left peak), in contrast, tended not to co-author with females in the same period.

The distribution of the same-sex collaboration ratio for females is the mirror image of that for males. Apart from the two extreme values of 1 and 0, the distribution of the ratio in question for males is basically uniform. For females, a gradual decline in the ratio is clearly observed. Comparing the extremes, there are more females without same-sex collaboration than males for the same collaboration type; there are about three times more males who collaborate only with males compared with females who collaborate only with females.

When we examine academic positions, in a similar vein, the same-sex collaboration ratio by males decreases with the highest academic position reached (Table 8). In contrast, the same ratio for females increases with academic positions, although its level is


Table 8

The median same-sex collaboration ratio by academic position and gender.

| | Male | Female | Total | Z | p-value |
|---|---|---|---|---|---|
| Assistant Professor | 0.5263 | 0.1053 | 0.3077 | -37.583 | < 0.001 |
| Associate Professor | 0.5000 | 0.2083 | 0.3636 | -20.695 | < 0.001 |
| Full Professor | 0.3924 | 0.2500 | 0.3333 | -8.840 | < 0.001 |
| Total | 0.5000 | 0.1538 | 0.3333 | -44.291 | < 0.001 |

Figure 5. The same-sex collaboration ratio: distribution by age groups and gender (boxplots and violin plots combined).

still very low for all females. While the median ratio level for females increases two and a half times when we move up the academic ladder, it is still much lower compared with that of males. While 50% of female assistant professors are characterized by the same-sex collaboration ratio at the level of 0.105, for female full professors, the ratio increases to 0.250. See a graphical summary for age groups and academic positions in Figure 4.

The gender difference in collaboration patterns can be studied in more detail using boxplots and violin plots combined. The gender difference by age group (Figure 5) closely resembles the gender difference by academic position (Figure 6). Female scientists consistently, across the three age groups and across the three academic positions, tend not to collaborate with other females (compare the shapes for Ratio = 1, i.e., females collaborating only with females, across the age groups and academic positions for females). Note that the median shown in boxplots is much lower for each group for females than for males, and it increases for females with age; it is also much lower for female assistant and associate professors and lower for female full professors.

Inverse proportionality in collaboration patterns between males and females is visible for each age group and each academic position. In terms of within-sex variation, male scientists are more differentiated than female scientists (compare the height of the boxes in the two columns) for each age group and each academic position studied. Females, and especially young females and female assistant professors, tend not to collaborate with other females. As can be seen from Figures 5 and 6, generally, conclusions from a study of age groups resemble conclusions from a study of academic positions.

While above, we have studied three broad age groups, below, we focus on biological age as a numerical variable. The year-by-year approach illustrated by regression lines in Figure 7 generally confirms the two opposite trends for both genders, at least until the age of 60 for males and for all ages for females. Interestingly, the generally downward trend in the same-sex collaboration ratio for male scientists is reversed for those aged 60 and above: the ratio for the oldest males increases. In contrast, for female scientists, the damped growth characteristic of all ages until about 60 turns into exponential growth for the oldest female scientists (a cut-off point of 70 is used, the standard retirement age for full professors). The dots in Figure 7 represent the median value of the same-sex collaboration ratio for each year of age. Relatively high variation of median values for very young male scientists and no variation for


Figure 6. The same-sex collaboration ratio: distribution by academic position and gender (boxplots and violin plots combined).

Figure 7. The same-sex collaboration ratio by gender and age. The regression line was estimated using the method of local polynomial regression fitting. The gray area represents 95% confidence intervals. Each year of age is represented by a single dot (a cut-off point of 70 is used). Dots represent median values.

very young female scientists (see the respective dots in both panels) is caused by the low numbers of scientists in these age groups. Thus, Hypotheses 2 and 3 are confirmed for males but not for females.

4.3. The same-sex collaboration ratio by academic discipline

Hypothesis 4. We would anticipate that the same-sex collaboration ratio is higher in male-dominated academic disciplines (confirmed).

First, we examined the correlation level between the mean same-sex collaboration ratio (ranging from 0 to 1) and the percentage of male scientists within the discipline (see Figure 8). The correlation between the two variables is weak (r = 0.228, R² = 0.052);


Figure 8. Correlation between mean same-sex ratio and percentage of men in the ASJC disciplines (bubble size reflects the number of scientists).

Figure 9. The same-sex collaboration ratio: distribution by discipline and gender.

two categories: female-dominated (left of the vertical dotted line indicating 50%) and male-dominated (right of the line and on the line, by our definition; see Table 1 for the variables). The bubble size reflects the number of scientists. In five disciplines (CHEM, ENVIR, ECON, SOC, and HUM), the percentage of men is very close to 50%. The highest mean same-sex collaboration ratio is not correlated with the male and female distribution within a discipline: it is equally high for physics and astronomy (PHYS) and computer science (COMP), in which male participation exceeds 80%, as it is for pharmacology, toxicology, and pharmaceutics (PHARM) and biochemistry, genetics, and molecular biology (BIO), with male participation in the 30–40% range. At the same time, while social sciences, arts and humanities, and economics, econometrics, and finance (SOC, HUM, and ECON) exhibit a mean same-sex ratio of around 0.5 among the five gender-balanced disciplines (those with close to 50% male participation), chemistry (CHEM) exhibits a ratio of around 0.7.

The same-sex collaboration ratio differs vastly by discipline and by gender. Previous research shows that as the fraction of female researchers in a discipline increases, women increasingly tend to publish with other women; also, the propensity of men to co-author with women is higher in disciplines with more women (Boschini & Sjögren, 2007, p. 339). A good way to visualize gender differences in the median same-sex collaboration ratio is through a heat map (the color palette in Table 9 changes from light blue for low values to deep blue for high values). In the case of COMP, ENG, and MATH, with the high overrepresentation of male scientists, the ratio for males is extremely high (and the median values reach the level of 1 or almost 1). That is to say, at least half of male scientists in these disciplines collaborate only with males. In COMP, ENER, ENG, HEALTH, PHYS, and VET, at least half of females do not collaborate with females at all (and the median values reach the level of 0 or almost 0). In contrast, in disciplines such as PHARM, PSYCH, and SOC, the median value for females is significantly higher than for males. The median level by ASJC discipline is also shown graphically in boxplots in Figure 9. Thus, Hypothesis 4 is confirmed.

4.4. The same-sex collaboration ratio by institutional type

Hypothesis 5. We would expect that the same-sex collaboration ratio is higher in research-intensive universities (confirmed for males but not for females).

Previous literature indicates differences in gender homophily in research collaboration not only by discipline but also by institution. Therefore, we will test whether the same-sex collaboration ratio also differs by institutional type: we contrast the 10 research-intensive institutions with 75 other institutions in the national system. The 10 institutions are the IDUB (or “Excellence Initiative–Research


Table 9

The median same-sex collaboration ratio by discipline and gender (shading: from the highest ratio in dark blue to the lowest ratio in light blue).

University”) institutions, which were selected for additional research funding for the 2020–2026 period. The IDUB institutions include both top Polish universities and polytechnic institutes (similar results were achieved for the top 10 institutions in terms of publication numbers overall and publication numbers within the Scopus 90th–99th journal percentiles).

For male scientists employed in the IDUB institutions, the ratio is high: the proportion of articles published only with males by the upper 50% of male scientists is at least 60% and is larger than the overall ratio for males in the system (see the Total line in Table 10: 50%). For female scientists, in contrast, the same proportion in the IDUB institutions is more than four times lower and is even lower than the overall ratio for females in the system. In other words, we reach the somewhat surprising conclusion that for males, the proportion of all-male collaboration in individual publication portfolios is higher in research-intensive institutions than the already high proportion for all institutions, while for females, the proportion of all-female collaboration is lower in research-intensive institutions than the already low proportion for all institutions.


Table 10

The median of the same-sex collaboration ratio by institutional type and gender.

| Institutional type | Male | Female | Total | Z | p-value |
|---|---|---|---|---|---|
| Research-intensive (IDUB) | 0.6000 | 0.1348 | 0.4138 | -30.717 | < 0.001 |
| Rest | 0.4444 | 0.1667 | 0.2857 | -31.992 | < 0.001 |
| Total | 0.5000 | 0.1538 | 0.3333 | -44.291 | < 0.001 |

Figure 10. The same-sex collaboration ratio: distribution by institutional type and gender (boxplots and violin plots combined).

In the Polish academic science system as a whole, the same-sex collaboration ratio for males is more than three times that for females (a finding which is confirmed by a fractional logit regression analysis in Section 4.6 below). Figure 10 shows the gender difference in the median same-sex collaboration ratio by institutional type and gender in more detail using boxplots and violin plots combined. The distribution of the median ratio for females is basically the same in both institutional types, and the within-sex variation is much higher for males than for females, as indicated by the height of the boxplots. The difference between the median values for males and females is much larger in the case of research-intensive institutions; the median value for males is much higher in these institutions than it is for females.

This effectively means that in research-intensive institutions (see the top IDUB panel in Figure 10), males as well as females are more likely to collaborate with males. Gender homophily is thus stronger for males and weaker for females in research-intensive institutions. In other institutions (see the bottom panel), the number of males collaborating exclusively with males and the number of males collaborating exclusively with females are equal; the number of females collaborating exclusively with males and the number of females collaborating exclusively with females are similar in both institutional types (see the large base on which the two right columns rest for female scientists in both panels). Thus, Hypothesis 5 is confirmed for males but not confirmed for females.

4.5. The same-sex collaboration ratio by journal prestige

Hypothesis 6. We would expect that the journal prestige level of mixed-sex publications is higher than that of same-sex publications for both male and female scientists (confirmed).

Both the quantity and quality of output in academia are relatively easily measured (with all standard limitations) using the Scopus database as articles are published in journals of different ranks. The scientists in our sample have their own unique individual publication portfolios with publications, translatable into average individual prestige via Scopus citation metrics. The prestige of each article in this portfolio is derived from the prestige of the journal in which it was published and is defined by the percentile rank ascribed annually to each academic journal within its ASJC discipline. Top journals, including the Journal of Informetrics and Scientometrics, are usually ranked in the upper 5% of journals.


Table 11

The median prestige level distribution (by percentile from 0–99, with the 99th percentile being the highest) of publications by major gender collaboration type and gender.

| | Mixed-sex collaboration | Same-sex collaboration | Solo research (zero collaboration) |
|---|---|---|---|
| Male | 62.50 | 59.17 | 50.00 |
| Female | 62.20 | 58.00 | 46.50 |
| Total | 62.42 | 58.27 | 48.50 |
| Z | -1.497 | -5.981 | -5.121 |
| p-value | 0.134 | < 0.001 | < 0.001 |

Figure 11. The prestige level distribution of publications (by Scopus percentile rank from 0–99, with the 99th percentile being the highest in prestige) by major collaboration type, gender, and discipline.

Importantly, the citation-based percentile ranking system used by Scopus is being systematically used in Poland, for instance in a complicated system of indicators used first to select (in 2019) and then to additionally finance (in 2020–2026) 10 research-intensive Polish universities. We used the measure of average prestige, which represents the median prestige value for all publications written by a given scientist in the study period of 2009–2018 for three categories of publications (same-sex, mixed-sex, and solo publications). For journals for which the Scopus database did not ascribe a percentile rank, we have ascribed the percentile rank of 0; Scopus ascribes percentiles to journals in the 25th to 99th percentile range, with the highest rank being the 99th percentile.
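The prestige measure described above, a median over journal percentile ranks with unranked journals counted as 0, can be sketched as follows. The function name and the use of `None` for an unranked journal are assumptions made for illustration.

```python
from statistics import median


def portfolio_prestige(percentile_ranks):
    """Median Scopus percentile rank across a scientist's publications.

    percentile_ranks: one entry per publication; an integer rank
    (25-99 in Scopus) or None when the journal has no ascribed rank.
    Unranked journals are counted as 0, following the rule in the text.
    """
    if not percentile_ranks:
        return None
    return median(0 if rank is None else rank for rank in percentile_ranks)


print(portfolio_prestige([90, None, 60]))  # 60
print(portfolio_prestige([85, 70]))        # 77.5
```

Because the median is used, a scientist with more than half of his or her articles in unranked journals receives a prestige value of 0, regardless of how highly ranked the remaining journals are.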

The median prestige level (in a range of 0–99) for all Polish publications written in same-sex and mixed-sex collaboration by gender does not differ much (Table 11): the median values for all-male publications and all-female publications by gender are almost identical (59.17 and 58.00, respectively). Also, the median value for mixed-sex collaborations does not differ significantly by gender. Both males and females, on average, regardless of the collaboration type, publish in journals with relatively low prestige. Articles written in mixed-sex collaboration are, on average, published in more prestigious journals than those written in same-sex collaboration and in much more prestigious journals than solo articles (see the Total line in Table 11).

The distribution of the median journal prestige level by discipline and collaboration type (mixed, same-sex, and solo publications, separately for males and females) shows both common patterns and substantial variations. Generally, for each ASJC discipline (Table 12), solo research is characterized by the lowest prestige level. BIO, CHEM, ENER, and PHARM belong to disciplines with the highest median prestige level, regardless of the collaboration type. Both mixed-sex and same-sex collaborations have higher average prestige levels than do solo articles.

Thedifferences inprestigelevelbygender areasfollows:formixed-sex collaborations,theyaremarginal,butfor same-sex collaboration,they aresubstantial (comparethesame-sexcollaboration columnsformales andfemales inTable 12).Male-only collaborationshavehighermedianprestigethandofemale-onlycollaborations,andthispatternischaracteristicofalargenumber ofdisciplines.Malescollaboratingwithmales,onaverage,publishinmoreprestigiousjournalsthandofemalescollaboratingwith females.Soloresearch byfemalesexhibitslowermedianprestigelevelsthandoessoloresearchby malesinall exceptfornine disciplines(includingBIO,CHEMENG,ENER,ENG,MATER,MED,andPHARM).ThemedianprestigelevelbyASJCdisciplineand genderisalsoshowngraphicallyintheboxplotsinFigure11 togobeyondthemedianvaluesandtohighlightintra-disciplinary cross-gendervariability,withthreeseparatepanelsforthethreegender-definedcollaborationtypes.Thus,Hypothesis6isconfirmed. 4.6. Amodelingapproach:Afractionallogitregressionmodel

Hypothesis 7. In a fractional logit regression model, we would anticipate that individual-level independent variables are more influential than institutional-level independent variables in predicting the same-sex collaboration ratio (not confirmed).


Kwiek and W. Roszka Journal of Informetrics 15 (2021) 101171

Table 12

The median prestige level for publications (by Scopus percentile ranks from 0–99, with the 99th percentile being the highest in prestige) by major collaboration type, gender, and discipline (shading: from the highest median prestige level in dark blue to the lowest median prestige level in light blue).


Table 13

Fractional logistic regression model statistics, dependent variable: the same-sex collaboration ratio (N = 21,467). R² = 0.159

| Variable | Estimate | Robust std. errors | t value | Pr(>\|t\|) | VIF |
|---|---|---|---|---|---|
| (Constant) | -0.873 | 0.096 | -9.087 | < 0.001 | |
| Age | 0.005 | 0.001 | 3.497 | < 0.001 | 2.030 |
| Male | 0.890 | 0.024 | 36.760 | < 0.001 | 1.147 |
| IDUB | 0.214 | 0.031 | 6.851 | < 0.001 | 1.823 |
| Full Professor | -0.250 | 0.043 | -5.820 | < 0.001 | 2.061 |
| Associate Professor | -0.060 | 0.029 | -2.102 | 0.036 | 1.391 |
| STEM discipline | -0.174 | 0.055 | -3.136 | 0.002 | 1.037 |
| Mean prestige points | -0.002 | 0.001 | -2.260 | 0.024 | 1.078 |
| Number of employees | -0.00004 | 0.00002 | -2.509 | 0.012 | 1.798 |
| Male-dominated discipline | 0.744 | 0.023 | 32.050 | < 0.001 | 1.133 |
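The estimates in Table 13 are on the log-odds scale, so exponentiating them gives odds ratios for the expected same-sex collaboration ratio. The short sketch below does this for four of the published estimates; the dictionary layout and variable names are our own illustration, and the inputs are the rounded values as printed, so the results are approximate.

```python
import math

# Estimates copied from Table 13 (log-odds scale, rounded to three decimals).
estimates = {
    "Male": 0.890,
    "IDUB": 0.214,
    "Full Professor": -0.250,
    "Male-dominated discipline": 0.744,
}

# exp(estimate) is the multiplicative change in the odds of the expected
# same-sex collaboration ratio for a one-unit change in the predictor,
# holding the other predictors fixed.
odds_ratios = {name: math.exp(b) for name, b in estimates.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Values above 1 (Male, IDUB, Male-dominated discipline) indicate higher odds of same-sex collaboration, and values below 1 (Full Professor) indicate lower odds.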

Finally, we move from descriptive statistics and two-dimensional analysis to modeling, and we use a regression model for a fractional dependent variable: a fractional logit regression model (Papke & Wooldridge, 1996), designed for variables bounded between zero and one (as with our dependent variable: the same-sex collaboration ratio). Linear models to examine how a set of explanatory variables influences a given proportion or fractional response variable are not appropriate here (Ramalho, Ramalho, & Murteira, 2011, p. 19). In this model, no special data adjustments are needed for the extreme values of zero and one.
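For reference, the general form of a fractional logit model can be written as follows. This is the textbook specification associated with Papke and Wooldridge (1996), not an equation printed in this section; the symbols $y_i$ (the same-sex collaboration ratio of scientist $i$), $\mathbf{x}_i$ (the vector of independent variables), and $\boldsymbol{\beta}$ are our notation.

```latex
% Conditional mean: a logistic function keeps predictions inside (0, 1)
E(y_i \mid \mathbf{x}_i) = G(\mathbf{x}_i \boldsymbol{\beta})
  = \frac{\exp(\mathbf{x}_i \boldsymbol{\beta})}{1 + \exp(\mathbf{x}_i \boldsymbol{\beta})},
  \qquad 0 \le y_i \le 1

% Bernoulli quasi-log-likelihood contribution of observation i;
% it is well defined for y_i = 0 and y_i = 1
\ell_i(\boldsymbol{\beta}) = y_i \log G(\mathbf{x}_i \boldsymbol{\beta})
  + (1 - y_i) \log\bigl[1 - G(\mathbf{x}_i \boldsymbol{\beta})\bigr]
```

Because the quasi-log-likelihood remains finite at $y_i = 0$ and $y_i = 1$, the boundary values need no transformation, unlike a log-odds transformation of the dependent variable.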

In our case, we have 24 ASJC disciplines represented in our 85 research-involved institutions. The number of employees and the percentage of female scientists vary in each of them; each discipline in each institution is either male-dominated (i.e., with 50% or more male scientists) or female-dominated (i.e., with more than 50% female scientists). We also have a set of 10 highly research-intensive institutions (termed IDUB institutions) and one containing the rest of them. Individual scientists are embedded in their institutions and in their disciplines, and both institutions and disciplines have their specific patterns of cross-gender collaboration. In some disciplines and institutions, same-sex collaboration is more prevalent than in others. For the sake of clarity, here is an example: a single observation here is not a male mathematician with individual features only, such as biological age and academic position. This male mathematician is also embedded in a highly research-intensive institution (variable: IDUB type) employing 2,000 teaching and research faculty (variable: number of employees) and publishing in the discipline of mathematics (variable: STEM discipline), which is male-dominated (variable: male-dominated discipline). Furthermore, in this institution, male mathematicians customarily tend to publish with their male rather than female colleagues.
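To make the example concrete, the sketch below plugs a profile like the male mathematician's into the rounded estimates from Table 13 and applies the inverse logit. Several things here are assumptions rather than facts from the paper: that predictors enter on their raw scales (age in years, employees as a head count, prestige as a 0–99 percentile), that the unlisted academic rank is the reference category, and the illustrative values for age, rank, and mean prestige. The function and key names are hypothetical.

```python
import math

# Estimates from Table 13 (rounded as published).
COEF = {
    "const": -0.873,
    "age": 0.005,
    "male": 0.890,
    "idub": 0.214,
    "full_professor": -0.250,
    "associate_professor": -0.060,
    "stem": -0.174,
    "mean_prestige": -0.002,
    "employees": -0.00004,
    "male_dominated": 0.744,
}


def predicted_ratio(profile):
    """Inverse logit of the linear predictor built from the Table 13 estimates."""
    eta = COEF["const"] + sum(COEF[key] * value for key, value in profile.items())
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical profile: age, rank, and mean prestige are illustrative only;
# the raw scaling of each predictor is an assumption.
mathematician = {
    "age": 45, "male": 1, "idub": 1, "full_professor": 0,
    "associate_professor": 1, "stem": 1, "mean_prestige": 60,
    "employees": 2000, "male_dominated": 1,
}
# Same profile with only the gender indicator switched off.
female_counterpart = dict(mathematician, male=0)

p_male = predicted_ratio(mathematician)
p_female = predicted_ratio(female_counterpart)
print(round(p_male, 2), round(p_female, 2))
```

Under these assumptions, the two profiles differ only in the gender indicator, so the gap between the two predicted ratios isolates the contribution of the Male estimate for this particular combination of characteristics.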

In the regression model, we also include the mean individual publication prestige percentile, which requires a clear explanation as it differs from the prestige attached to an individual article (and it may be considered to be a consequence of the collaboration rather than a cause of the collaboration). Collaboration, as defined in this paper, is considered to be a product rather than a process: collaboration between two scientists is viewed only through the proxy of the paper they co-authored and published. By our definition, every scientist in our dataset has his or her own, clearly defined mean individual publication prestige percentile, which is determined by the entirety of their Scopus-indexed publication output from the decade studied (each article is linked to its source or journal, with a clear Scopus-calculated highest journal percentile). Consequently, the mean individual publication prestige percentile for each scientist (with the 99th percentile being the highest) is an individual-level predictor: it is a proxy for the average prestige of their publication history, which spans at most a decade. It is higher for scientists publishing exclusively in top journals and lower for those publishing in a combination of top and second-tier journals or in second-tier journals only, as ranked by Scopus.

As our dependent variable is fractional (ranging from zero to one), we estimate a fractional logit regression model. We estimate odds ratios for conducting same-sex collaboration in journal publishing, i.e., publishing with scientists of the same sex. We calculate the same-sex collaboration ratio as the share of same-sex collaboration articles among all published collaborative articles in each scientist's individual publication portfolio. Using a fractional logistic regression approach, we estimated the probability of conducting same-sex collaboration.
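The calculation of the dependent variable can be sketched as follows. This is a minimal illustration, assuming that an article counts as same-sex when every author has the focal scientist's gender and that solo articles are left out of the denominator; the `Article` record and the function name are hypothetical and not part of the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical record: genders of all authors, focal scientist included.
    author_genders: tuple


def same_sex_ratio(own_gender, portfolio):
    """Share of collaborative articles whose authors all share own_gender.

    Returns None when the portfolio holds no collaborative articles.
    """
    collaborative = [a for a in portfolio if len(a.author_genders) > 1]
    if not collaborative:
        return None
    same_sex = sum(
        all(g == own_gender for g in a.author_genders) for a in collaborative
    )
    return same_sex / len(collaborative)


portfolio = [
    Article(("M", "M")),       # same-sex collaboration
    Article(("M", "F", "M")),  # mixed-sex collaboration
    Article(("M",)),           # solo article, excluded from the ratio
    Article(("M", "M", "M")),  # same-sex collaboration
]
ratio = same_sex_ratio("M", portfolio)
print(ratio)
```

In this toy portfolio, two of the three collaborative articles are all-male, so the ratio for the focal male scientist is two thirds.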

The first type of independent variable captures scientists' individual demographic, biographical, and bibliometric characteristics: gender, biological age, mean individual publication prestige level within the study period of 10 years (or less), current academic position, and the type of the dominant Scopus-defined ASJC discipline (STEM or non-STEM). The second type of independent variable captures three major institutional characteristics: a binary variable indicating employment in an IDUB or non-IDUB institution (being employed full-time in one of the 10 highly research-intensive institutions or not), the number of scientists employed in the author's institution (in FTEs in 2018), and publishing in a male-dominated discipline or not. The absence of collinearity among the independent variables was confirmed through an analysis of VIF coefficients (see Table 17 in Data Appendices). Although the correlation table of independent variables shows in some cases (e.g., IDUB and the number of employees, full professor and age) a pairwise correlation of moderate strength, the vector of independent variables is not characterized by significant collinearity, as indicated by the VIF coefficients (Table 13). The correlation between these pairs is largely controlled by other variables in the model.
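As a reminder of what the VIF coefficients measure, the standard definition is given below, followed by a worked value for the largest VIF in Table 13. The notation $R_j^2$ (the coefficient of determination from regressing predictor $j$ on all other predictors) is ours, and the worked figure is simple arithmetic on the published value, not a result reported by the authors.

```latex
% Variance inflation factor of predictor j
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}

% Worked example: the largest VIF in Table 13 (Full Professor, 2.061)
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j} = 1 - \frac{1}{2.061} \approx 0.515
```

In other words, about half of the variance of the Full Professor indicator is explained by the remaining predictors, which is well below the commonly cited rule-of-thumb thresholds of 5 or 10 for problematic collinearity.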

The distribution of residuals in our dataset was not normal (the K-S normality test statistic equals 0.104, with a p-value of less than 0.001). Normality of the residual distribution is what would allow statistical inference on the model properties, because the standard significance tests assume a normal distribution. To overcome this inconsistency between the model and its assumptions, robust standard errors were estimated, and significance tests for the individual coefficients in the model were conducted on the basis of these estimates.
