This is a chapter of a published dissertation: Situating Web Searching in Data Engineering: Admissions, Extensions, Repairs, and Ownership.

Outside the search confessions, the broader practices that support web search, and the shared repairing, the data engineers report experiencing and discursively defend web search as a solitary and private professional endeavor. Why does search remain a private and solitary practice given the rich trace data generated by search, the rapacious appetite for data collection and analysis among the data engineering community and the companies in which they practice? And what allows this community to avoid the scrutiny generally deemed essential to self-reflection, optimization, efficiency, and innovation?

I will show data engineers describing their web searching as solitary, speedy, and secret. I show two aspects of how data engineers talk about searching as solitary, solo, or alone. First, they describe search as though the activity of web searching is wholly their responsibility, to be completed by the individual without or apart from the support of colleagues: by oneself. Second, they describe search as apart from others entirely, suggestive of an apparently autonomous individual: on one’s own. Then I discuss two further descriptions: data engineers’ interest in speed, in search being fast, and how data engineers talk about wanting to keep their searches private, saying that they would be embarrassed to share them.

Several frames from related literature suggest different interpretations of the solitariness and privacy of the data engineers’ web search activity. I look at the design of search; at constructed but now default expectations of (perceived) privacy; and at both “rugged individualism” (Ensmenger, 2015) and norms around generalized reciprocity (Coleman, 2012, Weber, 2004) in the coding professions. These all contribute to an understanding of the solitary and secretive searching. Then I ground my argument in work on the value of privacy for learning and on the learning strategies of the organization.

I show how the firm delegates the task and practice of search to such a degree that it has foregone ownership of searching. As a consequence, the already marginalized suffer from the current informality of workplace web searching in data engineering and from what that informality hides, and the organization's learning is poorer: the price of preserving the status quo.

Solitary, speedy, and secretive searching

Solitary searching

The data engineers offered different motives for solitary searching. Some reported being expected to search alone, while others reported needing, or wanting, to search alone. The examples in the two subsections below detail two aspects of a standing repertoire for referring to web searching. The repertoire suggests particular orientations towards or understandings of web search. In [by oneself] I’ll share examples of how data engineers talk about searching as a responsibility to be performed apart from the immediate support or involvement of their colleagues. In on one’s own, I’ll show how they talk of search as apart from others entirely, where the standard references to the search activity do not bring to mind any relation to others who might be preparing material to be found or to those who searched before, shaping the results they will see.

[by oneself]

Data engineers spoke of searching, finding, teaching, or figuring by themselves, yourself, or oneself. Sameer said he’d carefully suggest searching to new engineers⁹⁸: “‘hey, have you tried googling it because it seems like a very simple thing you can find yourself.’”

Devin discussed self-reliant searching as a key strategy for dealing with the diversity and complexity of the tooling environment for data engineering:

you really have to … figure out those tools yourself.

When somebody says to go and google it, to get that answer, they also expect you to know what is a good answer, what is a bad answer. [ . . . ] I think as a data engineer you are expected to know what’s good and bad. So I don’t think there is a problem with saying ‘google it’.

While Aditya was hesitant to place much emphasis on being able to search, what he described fits well within this narrative that individual engineers are expected to be able to independently search for (or ‘find’) information:

I wouldn’t say that the expected skill is that you know how to search. Like, I think that’s almost like not something someone focuses on. But that you can— teach yourself or find yourself the information to learn some skill and, like, how you do it could differ.

and learn independently and on the fly: “I expect everyone to have gaps and as the problem arises, we, basically say: ‘Teach yourself what you need to solve that problem.’”

These comments convey the role and perceived value of autonomy and individualization of search work in data engineering practice. It isn’t just that the engineers are expected or allowed to search the web on their own; they need to. Moreover, they must also be able to choose means besides searching the web when those are necessary to do their work. These descriptions of why search is solo, and of its value, are distinct from the practical and spatiotemporal factors that shape search as a solitary exercise (described in Admitting searching). This language frames and justifies solitary, or more specifically autonomous, searching in terms of values: efficiency, flexibility, adaptiveness.

on one’s own

Just above, I introduced a sense that the activity of web searching is responsibly completed by the individual, without or apart from colleagues. Now note here how that ostensibly solitary work is described. The on [your/their/my] own phrasings are suggestive of an apparently autonomous individual (absent relations to others)⁹⁹.

I asked Michael how he thinks through the process of considering whether to ask a colleague or manager a question when feeling unable to find some bit of information. I was asking here particularly not about internal tooling (as interviewees commonly spoke of that as something they’d ask questions about internally). He said most of the time he’d ask questions about internal tooling or project-specific components he was interacting with. But, regarding “general development” work (his language), he discussed a sense of individual responsibility he learned from his mentors, managers, and peers (what he also considered “a common practice within the industry”).

For more general development, development where it doesn’t matter what the component is, it’s more of something that you as an engineer should be able to solve on your own.

It is definitely different for other engineers, but from mentorship and my managers, and other peers, it’s more of try it out on your own, try to solve it the best you can, and keep searching, and spend all your resources first, and then go to your team if you’re actually truly stuck and you can’t figure out something. As an engineer don’t depend too much on hand holding. But if I am stuck on something then I go to more senior technical leaders.

And I say, ‘OK, I can’t figure this out can you help me debug this’ and sometimes that works very well because they can see things that I didn’t see or I skipped over, et cetera.

I do think that is more of a common practice within the industry and just engineers in generally. Is sort of: Be a ‘go getter’ and try to solve it on your own

And also discern whether or not you’ll be able to do it quickly and then, if not, seek help, et cetera.

But then that’s obviously a problem because if you are trying to solve it on your own through web search, etc., you’re also depending on verified, validated, quick results to come up where it would help you in a way where you could actually solve it on your own.

I’ll flag here some of the language he used:

“solve on your own”
“try it out on your own”
“try to solve it on your own”
“trying to solve it on your own through web search”.

He’s describing an expectation set by others that he has adopted for himself. This is an expectation, or an evaluative criterion, I found throughout my interviews. Yes, data engineers are physically remote from each other. However, the language of the interviewees describes search as performed alone in a distinct way. They position themselves in dialogue with an inanimate corpus of material or perhaps the search engine, while in reality searches connect data engineers with searchers across the World Wide Web, and across time and space. Ironically, while they describe search being performed alone, they often include many references to actual and anticipated interactions with other people who variously constrain, compel, or coach searching. These people may also be distant in time and space, but they are closer to the heart and mind than the searchers hidden behind the screen.

Sometimes the solitary phrasings were only used to indicate the lack of search talk.¹⁰⁰ I asked Shreyan: “Do you talk about searches with your co-workers?” He replied: “they would search on their own”. There may be this larger solitary assumption underlying that remark, but it was principally addressing search talk.

Jamie used ‘on my own’ phrases to refer simply to not asking colleagues for help:

“I will try to get as far as I can on my own”
“I wanted to attempt to solve the problem on my own”

Similarly, when Jillian shared asking her colleagues questions, she said that while they are very forgiving and nice and would want her to ask questions, she generally does what she says they probably would not want her to feel obligated to do:

find the answer first, try and figure it out on your own. And then ask if you’re having an issue figuring it out. Or if you know you’re not going to be able to figure it out.

There is a general obligation that data engineers will take responsibility to search first. Some of what Jillian was discussing was a generally noted tendency to be so hesitant to ask for help, or so used to turning first to the search engine, that data engineers will get lost in rabbit holes, searching repeatedly without forward progress. It is those long searches that Jillian’s colleagues express an interest in avoiding. The “obligation to know”, to have searched and to search, “exists in tension with the expectation of asking questions” (Reagle, 2016, p. 698). Some of this solitary searching is driven by anticipated negative interactions providing a renewed impetus to search (discussed in the last chapter). Again, while the on [your/their/my] own phrasing above may at the first level refer to the data engineer searching apart from their colleagues’ active and in-the-moment engagement or presence, the phrasings of course reveal deeper interconnection because web search engages with a web, a network of people and other actors.

Phillip, discussing not ever having asked a question on Stack Overflow, said he didn’t know how long it would take¹⁰¹, and: “I just try to figure it out on my own.” (though still making use of, among other resources, Stack Overflow and questions and answers from others).

Kari described the progression from a novitiate heavily reliant on others to being “fully functioning on your own”. Kari uses “figure things out yourself” and “on your own” language in reference to searching the web. I asked her about talking about web search with newcomers to her organization. She described how she would underemphasize searching the web when onboarding new engineers onto her team (she’s also referring back to a comparison when she said she encourages the data scientists asking her questions to search first, and contrasting with how she described her own search work practices).

One of the first things that I’ll tell people is, for search specifically, I usually say the opposite for the people I work with. Ask a ton of questions and don’t try to figure things out yourself. I wouldn’t say that to a data scientist or every person at the company. But I think it is good at the beginning to really support new coworkers. Make sure that they don’t feel stupid for asking questions because it is expected that they don’t know. We don’t want people to get stuck in a hole, alone, and stuck on something… trying to figure it out by themselves when they can rely on people. In your first year at a company it is important to have support from everyone else and then you are fully functioning on your own and you’re not really going to need that as much anymore.

These excerpts affirm (and, if they faithfully represent comments within the workplace, reproduce or reinforce) the individualizing mythos of autonomous searching and an explicit individual rather than interdependent assignment of responsibility. (This is despite the actual interdependence in web searching and the many references to others’ influence on and shaping of searching.)

Speedy searching

Data engineers value speed. Their descriptions of the performance and articulation of web search highlight this interest in speed. The data engineers spoke of web search as providing the quickest or fastest way to answer their question.

At the very start of the interview, Ross noted that in his field “there are so many technologies, languages, frameworks, software packages, that you can’t know it all”. This research was “really interesting” because, he said, “being able to quickly find information is key to my job.”

Noah:

There could be a, some docs, I know exactly what I want but the fastest way to get it is just searching and- and try to get the first result so I use it to get to- to navigate documentation.

As mentioned previously¹⁰², Shawn spoke of quickly resolving problems by searching the exception message:

And that will cover you very quickly in about 90% of the time. You usually find the answer. And you can try it out, run a couple quick tests on it and make sure it actually works. And then you can move on with your day.

John said that often times using web search he’s looking for the “quickest way”: [ . . . ] I just need to—quick—glance at something to find a typo or identify an error.

Later he said (discussing whether his simple queries were embarrassing):

That’s what I do. I’d rather search it than try the [SQL] query, see that it erred, read the error, debug it. I’d rather quickly look it up often times.

Phillip said he’d never asked a question on Stack Overflow because he’s “not sure [how] fast the response would be.”

So searching is sometimes faster than trying the solution out.

Data engineers value speed (or local efficiency) in their use of web search.

A comment from Nisha underscores this. She’s responding to a question from me about whether she’ll read through the documentation for a tool or library when she’s stuck. She says instead she’d do web searches or reach out to the internal customer support team because that’s “the fastest way”. Reaching out to support is ‘ask[ing] someone for help’, but web search is not described that way:

I knew that reading documentation, if I got stuck on a specific area, was not going to add much. So I would either do a Google search, or if that didn’t help I would reach out to customer support. Scan through the documentation, obviously, do a search on the keywords in the documentation, or reach out to support because that’s the fastest way of getting my information. Rather than, you know the general framework, right, when you’re cycling, if a keyword search will not help, if you do not find anything in the documentation, reading through the whole documentation will just waste your time. It is better to just ask someone for help.

Searching speed may not only be about clock-time.

Sometimes the speed of searching is linked explicitly with the ease. Here’s Ajit:

I do remember, oh, I did something similar. But instead of actually going back and looking into the code where this happened, it’s much easier for me to just do a quick web search.

It’s “more straightforward” and “much easier” to do a quick web search.

The speed is partially necessary because of how they are expected to learn on the fly, or just in time—as Aditya said:

I expect everyone to have gaps. And as the problem arises, we, basically—Teach yourself what you need to kinda solve that problem. Rather than let’s proactively try to just fill gaps across the place. there are going to be gaps

Secretive searching

Searching is also secretive, or kept secret. There is a deep intimacy to searching the web that carries over to workplace web searching. Data engineers indicate a strong desire to keep their searching secret. This is also a motivation for searching alone.

As discussed throughout, there is limited talk about search. I asked Nisha if people ever talked about search. She just shook her head no, while laughing and smiling.

Recall Amar, who said, “it’s kind of like 90% of my job to just look things up.” He said:

engineers—at least in my experience or at least within my team—will not explicitly discuss their process

I asked Shreyan if he talked about searches with his co-workers and he said it happens very little, saying “I don’t want to restrict their pattern of thinking¹⁰³.”

Jillian was a new data engineer when I first talked with her, only a few months on the job. In talking about embarrassment about what she thought was an excessive reliance on web search, she said: “I would assume that I am searching things far more frequently than my peers.” About talking about searches, Jillian said:

I don’t think I necessarily talk about it with them. I feel like I try and hide. I feel like I know very little and try and hide that from my peers. I don’t want them to know how little I feel like I know. Let’s just say I wouldn’t want them to see my search history of my coding related things.

In a member check a year later, she no longer thought she was searching more than others:

I am realizing that whenever I ask questions to people who I deem as smart or intelligent, I’m now really realizing that the skills that they have are the ability to quickly search.

They don’t just sit and think through it to give me an answer. They immediately go to their computer, maybe it’s searching code if it’s clearly not something that’s going to be on the internet, but if it’s going to be on the internet, they’re really googling it.

It’d be a big task to turn the whole narrative of googling things being an embarrassing thing, maybe to being a very admirable task, but I do start to recognize that, okay, people that I want to be like are definitely just constantly looking things up.

But the collective narrative hadn’t shifted.

One of the worst things I could have imagined happening, for my job, would be if people could see my search history, because then it exposed all the things I didn’t know.

And I think the flip side of that is it would probably actually just expose maybe the things I’m going out and trying to learn and understand better, but you don’t necessarily know.

The secrecy around search shapes people’s understanding of its use, driving misperceptions that lead to shame (Jillian definitely was not searching more than others) and hiding the value of searching the web and what it can support.

With the description of these three findings in mind, and before proceeding to the discussion of them, I will next describe what I looked for but did not find: technocratization of search.

Technocratization of search

While doing this research there was an ever-present question on my mind: are companies monitoring and managing, or might they monitor and manage, the web searching practices of their workers in order to improve performance? Would data engineers turn their skill at data analysis upon their own workflows, to improve their own work and potentially competitive standing, or to collectively optimize data engineering work? On the other end, might companies or collectives of professionals develop tools to share search learnings or regularize search strategies through more structured and automated means?

I imagined that I might find, or research such as mine might unintentionally encourage, surveillance or control of web searching. I was concerned with attempts to surveil or supervise search in ways that reduce the autonomy of workers. Surveillance or control might harm people, violate rights, limit learning and even undermine efficient use of search.

I use technocratization of search to group various practices that I anticipated I might find. By technocratization of search, I mean the intentional application of techniques to influence search practices. By its similarity to ‘technocracy’, I hoped to convey some notion of “rule by experts” through the development of surveillance or quantification, processes or routines, or built artifacts that might shape searching practices. I imagined this might include the automation of portions of the search process (perhaps the gap-bridging studied by Bailey and Leonardi, see below) or modification of browsers to constrain or encourage particular behaviors. I imagined technocratization of search as consisting of tools for logging searches, tools for facilitating searches from the editor, tools for warning someone about the length of a search session, or tools for removing some websites from search results.

While it is distinct from the sort of technocratization of web search I was looking for, some have speculated that code generation tools may serve as a substitute for web search in coding work. It is easy to find people claiming on social media that their use of a code generation tool has replaced or will replace their use of web search for their coding work. The development of new code generation tools expanded rapidly during the course of my writing. While this was not the focus of the research, its relevance is undeniable. I have prepared an appendix reflecting on such tools: Appendix III. Code Generation Tools and Search.

The technocratization question comes amidst hopes and fears in regard to the political and economic power of companies wielding data and computation (ex., futures-of-work oriented popular press books like ‘Second Machine Age’, ‘Fourth Industrial Revolution’) and research examining the underlying mechanisms possibly explaining company strategies (here, Fourcade & Healy (2017) re “data imperative” and Zuboff (2015) re “logic of accumulation”) and looking at the introduction of new technologies into work processes (ex., the buffering and resistance detailed in Christin (2017)).

I initially drew on the language of informating from Zuboff (1988). Zuboff juxtaposed machines and automation (to automate) with information technology and informate. She argued that information technology “both accomplishes tasks and translates them into information.”

Information technology not only produces action but also produces a voice that symbolically renders events, objects, and processes so that they become visible, knowable, and shareable in a new way. [ . . . ] The word that I have coined to describe this unique capacity is informate. Activities, events, and objects are translated into and made visible by information when a technology informates as well as automates.

Zuboff argued that:

[. . . ] [W]hen the technology also informates the processes to which it is applied, it increases the explicit information content of tasks and sets into motion a series of dynamics that will ultimately reconfigure the nature of work and the social relationships that organize productive activity

In addition to attending to the potential collection of data on searching or application of that data to influence searching, I also looked for automation of portions of the web search activity.¹⁰⁴ A thread of research on bridging gaps between two technologies (Bailey et al., 2010, Bailey & Leonardi, 2015) and the imbrication of routines and flexible technology (Leonardi, 2011) describes situations where tools are sometimes created or adapted to improve coordination between people and their tools. Would I find gap-bridging in the data engineers’ web search practices?

The question can also be formulated as: Where is the data gathering (or the surveillance and informating, the logic of accumulation and the data imperative) and the gap-bridging to improve the labor process in the work of data engineers? I did not find such informating of web search or accumulating of search logs. Why do I seem to find so little technocratization of web search?

It may be suggested that the search work of the data engineers is a tacit skill (see also the discussion of searching as tacit-knowing in Admitting searching: Talk about search), not easily explainable nor reducible to programmed instruction. My questions about the lack of technocratization, though, don’t suggest or expect automation of the search work. Shestakofsky’s (2017) research identifies how “[i]n some instances, workers’ tacit skills [give] them an advantage over computer code in performing nonroutine tasks [ . . . ] because people possessed competencies grounded in tacit knowledge that could not easily be programmed [emphasis in original]” (p. 387). A significant amount of the search work may be nonroutine, but I was not looking for the tacit knowledge of the data engineers to be automated by machine, rather for any intentional application of technique to influence the searching of the data engineers.

While it may be the case that the search activity of the data engineers is intuitive and inexplicable even if noted or remembered (Dreyfus & Dreyfus, 2005), the question here is not about the development of a rule-based expert system, but why responsibility has been completely handed down to the engineers, who in turn do not build tools to scaffold or reflect on their search activity. Hodgson (2001) writes that “Workers have always possessed some tacit and other skills beyond the reach of managerial comprehension” (p. 193). But technocratization of search is also not developed by the workers themselves, workers well capable of developing logs of searches for their own reference or building tools to further scaffold their interactions with the search engines.

Informating and imperative accumulation by web search

Web search engines are no strangers to data collection and the application of such data to further their goals. Early researchers looked at how to use encoded links to aid in the automation of the process of finding and accessing distributed content, leading to hyperlinks and the web itself. Then web search engines informated from that structure. At the first level, we can imagine three primary parties involved in web search: the searcher, the search engine, and those producing content to be searched for. Organizations of the latter two types have heavily sought control and profit through informating points where they interface with searching (with the lattermost using search analytics to change their behavior to increase the quantity or quality of web site visits). In the coding work under examination here there is a fourth party, the coding firm, that intentionally or not exhibits control around the searching practice (from the reasons to search to the reception of the results-of-search) of the searching coders. These lenses provide a way to explore workplace web search, to explore the pressures shaping the context of searching, the search engines, and the searched-for content.

The data generated, or informated, as byproducts of the web search of data engineers can largely be seen put to use by providers of search engines and websites, with limited tangential use by individuals, their organizations, or shared in larger communities. Various researchers have identified that technology or routines may be shaped by organizational pressures such that information is produced to allow for better control of the work processes (Beniger (1986), Zuboff (1988) and (2015)) or in the belief that such information will prove valuable (the “data imperative” in Fourcade & Healy (2017)).

The data generated as byproducts of the web search of data engineers can largely be seen, besides in the continued development and maintenance of the systems that support web search, in the research and activities of the major search engines (e.g., research from Microsoft query log analysis which may perhaps “provides a pulse of what software engineers are searching for and what problems they face” (Bansal et al. (2019)) and Google’s “Foobar”). It is also used by builders of websites (learning from the searchers, clickers, and lurkers; Antin & Cheshire (2010)). (Three interviewees noted the use of such website analytics. Two interviewees described examples of using analytics to guide marketing or documentation: one a former data engineer-turned-evangelist for an open source tool for data engineers, the other also involved in developing tools for use by data engineers. One data engineer I interviewed formerly worked in search engine optimization and also attested to the value to websites in the search logs.) But I did not find any examples of the data engineers or their employers using search data for informating.¹⁰⁵

Gaps

I looked for bridging between technology gaps (Bailey et al., 2010, Bailey & Leonardi, 2015) or gaps between two technologies. Gap-bridging is where a new technology is introduced to connect a gap between two other technologies that previously the knowledge worker had to traverse manually. A simple example is if the result of a calculation from one piece of software has to be manually typed into another because they were not designed to interoperate. An engineer may then consider whether it is cost-effective to write new code to traverse that gap automatically, or at least without manually clicking and typing.

I imagined I might find examples of people who had written tools to transform an error message directly into a Google search. While such tools, or prototypes, exist, I did not find anyone building or using them. I also imagined a tool to move directly from notes to a search¹⁰⁶, or one for organizing queries from a search session. While many reported taking notes, they did not report integrating notes with the inputs or outputs of their search activity.
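To make concrete the kind of gap-bridging tool I had imagined, here is a minimal sketch in Python. It is my own illustration, not something any interviewee built or described. The function name `error_to_search_url`, the choice to search only the last line of the error trace, and the search URL pattern are all assumptions made for the example.

```python
from urllib.parse import quote_plus

# Assumed URL pattern for illustration; any search engine's query URL could be used.
SEARCH_URL = "https://www.google.com/search?q="


def error_to_search_url(traceback_text: str) -> str:
    """Build a search URL from the last non-empty line of an error trace.

    Hypothetical helper: it bridges the gap between the terminal and the
    browser that the engineer otherwise traverses by copying and pasting.
    """
    lines = [line.strip() for line in traceback_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no error text to search for")
    # The final line of a Python traceback usually names the exception.
    return SEARCH_URL + quote_plus(lines[-1])
```

The resulting URL could then be handed to the standard library's `webbrowser.open` to launch the search. Even a sketch this small carries choices about which part of an error is worth searching, the sort of judgment the data engineers described exercising on their own.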

Interviewees also did not discuss monitoring or scaffolding efforts to improve search, for example guardrails, hints, or timers to structure it.¹⁰⁷ The scaffolding discussed in Extending searching was developed to support the data engineering work as a whole, and it only incidentally embeds knowledge for successful searching. Those scaffolds were not an intentional application to influence search practices and I found no intentional supports for web searching.

One interviewee talked about how integrated development environments (IDEs), with the ability to provide reference information within the display, replaced some web searching. But the ability to reference documentation in the worker’s coding environment without web search has been around since before web search. The use of man pages for documentation reference goes back to the 1970s (Dzonsons, 2011).

People have long been used to fill in gaps between technologies, and perhaps different people than those who would have been previously running a routine. This is discussed in work on heteromation (Ekbia & Nardi, 2014, 2017) and computational labor (Shestakofsky, 2017). I did not find individuals used to fill gaps in searching outside of that discussed in Repairing searching.

One interviewee¹⁰⁸ invited me to speak at a small workshop at their company. They shared with me that they’d spoken with their manager about our conversation and that it had already provided an opportunity to talk explicitly about expectations in searching and asking questions. A month later I presented initial findings in a small teleconference with members of the company. Following up with this individual, I learned the earlier conversations and the presentation had spurred the use of a Slack channel for sharing things different team members had recently learned or looked up.

we fired up a channel that was a “no stupid question” sort of thing, and I think people have used that a lot more, like that’s been clear that people use them, and that was kind of one of the like action items we proposed after that was post questions here, or even if you search something and you find out something dumb, we’ve also done like a lot more like today I learned sort of reflections, which I think has been helpful, we didn’t used to really do that so much.

[ . . . ]

I don’t know how everyone’s individual searches and what not have gone, but I can see that one of the things that we definitely did was the channels and those are way more active. There were five people in it before but no one was using them, and now it’s pretty much our entire data group will post things, or when they like just do things wrong, they pretty much posted in there too, so it kind of morphed out of like a, ‘finding answers’ to also just like a more general like humility, I don’t know like, this is a learning environment, I guess, but that was helpful, like that was great, I mean even if that was not like one of the intended aspects.

This is not the sort of technocratization of web search that involves logging search queries or technical tools to shape the searching itself, but it was an intentional application of technique to influence the wider search practices.

Despite the heavy reliance on web search in data engineering and the heavy demands on the data engineers to learn, quickly and constantly, neither the engineers nor their employers seem to have intentionally applied technique to influence search practices. Remarkably, at least to me, data engineering workplaces are free from the technocratization overtaking many other workplaces.

Discussion: Delegating or foregoing ownership

What is going on? Why is search, so heavily used, so very much admitted into the work practices, still so solitary and secretive? Why does it appear as though technocratization of search is absent?

Initial explanations

Single-user design of web search

Prior work has discussed the single-user design of web search. Morris & Horvitz (2007) developed a prototype for collaborative web search, SearchTogether. They based its design on a survey suggesting that people (Microsoft employees were surveyed) want to collaborate “with friends, relatives, and colleagues when searching the Web” and that many already engaged in collaborative searching behaviors (also called “joint searching” and “multi-user searching” by the authors). They noted that, in 2007 at least, “current Web search tools are designed for a single user, working alone.” Morris & Teevan (2009), reviewing that prototype and two others in a review of collaborative web search, opened by remarking: “Today, Web search is treated as a solitary experience. Web browsers and search engines are typically designed to support a single user, working alone.” A few years later, Morris (2013) wrote, “the features of the primary tools for digital information seeking (web browsers and search engines) continue to reflect a presumption that search is a single-user activity.” But the design of the tool (the “technical functionality”¹⁰⁹) cannot be definitive: the people I interviewed are sophisticated builders and users of technology. The “latent structural constraints” (Surden, 2007) that follow from the design of web search, and that create some of the privacy in searching, are probably not the strongest determinants keeping data engineers from informating or programmatically interacting with their own searching activity. They recognize technologies as flexible (Leonardi, 2011).

Articulation of web searching as private

The expectations of privacy in web search may play a role. Despite competition and critique, Google has been very successful in articulating the use of its search engine as private. People recognize that their searches are sensitive and should be secured. Major media attention in the wake of the release of the AOL search logs in 2006 likely contributed to such a recognition.

Rugged individualism

A history of “rugged individualism” in software work likely plays a role (Ensmenger, 2015). Or, rather, the still active legacy of the manufactured perceptions of the role and responsibility of the individual during the 1960s and 1970s likely shapes some of the solitary and secretive searching. Ensmenger describes how the work was seen as involving “individual skill”, “individual expertise”, “individual programmers”, and “individual ability”. But it also was not wholly solitary work. Ensmenger also describes masculine competition and comparison as the stage for “display” (p. 59) or presentations of such skill. In their interactions, programmers would engage in “ritualized forms of competition” (p. 62) or otherwise find ways of “establishing dominance within the community hierarchy” (p. 57). This conceptualizing of skill as something owned by an individual, and the rituals that reinforce the associated epistemologies and responsibilities, persists in computing work. Reagle (2016) discusses the “obligation to know” and the performances of and for status or stature of knowers in his exploration of “geek knowing”.

It may be that desires to perform appropriate individualism and an awareness of competitive comparisons to others, with both shaping norms around shaming, drive some of the dynamics around solitary and secretive searching. But this also is not determinative. It doesn’t explain exactly what activity, knowledge, or skill might be concealed and what would be proudly and ritually revealed. Transparency in searching could potentially be the sort of artificial challenge that Ensmenger suggests provides an opportunity for showmanship. Masculine competition could be performed with displays of superior querying, or tooling for searching, or memorization in lieu of searching. The research from Ensmenger may actually sharpen the question, as he also shows how, despite the impressions, computing work is very social (p. 58):

despite the stereotype of the computer person as individualistic and “disinterested in people,” the computer center was a profoundly social space. [ . . . ] they were more than simply working alone, together. In practice, computer centers were abuzz with conversation and other forms of social interaction.

Generalized reciprocity and self-reliance

We can also look at another seemingly core tenet of computing professions: generalized reciprocity. Weber (2004), discussing open source and quoting from Constant et al. (1996), wrote (p. 140):

Generalized reciprocity is a firmly established norm of organizational citizenship within this community. Contributing code and helping others is a sign of “positive regard for the social system in which requests for help are embedded,” a manifestation of pro-social behavior observed in other technically oriented settings as well.

The solitary and secretive behavior might seem to stand in great contrast to notions of generalized reciprocity presented as a hallmark of open source and other technical communities. But reciprocity needn’t imply radical openness (Turco, 2016). Search activity may simply not be included in the community norms around what could or should be shared. The gift giving might flourish while never encroaching on cherished conceptions of individual knowledge and backstage privacy for search activity.

The reciprocity in these communities exists in coordination with the rugged individualism. As Weber goes on to write (p. 145):

The popular image of an open source hacker as a lone ranger emphasizes the self-reliant attitude that is certainly present but misses the deep way in which that self-reliance is known to be made possible through its embedding in a community. The belief is that the community empowers the individual to help himself.

We see the same in the Debian open source community chronicled by Coleman (2012). She observes that “hacker sociality alternates between communal populism and individual elitism” (p. 105). Coleman writes (p. 107):

On the other hand, hackers often express a commitment to self-reliance, which can be at times displayed in a quite abrasive and elitist tone. The most famous token of this stance is the short quip “Read the Fucking Manual” (RTFM). It is worth noting that accusations or RTFM replies are rarer than instances of copious sharing. [ . . . ] These two poles of value reflect pervasive features of hacker social and technical production as it unfolds in everyday life. It only takes a few days of following hacker technical discussion to realize that many of their conversations, whether virtual or in person, are astonishingly long question-and-answer sessions. To manage the complexity of the technological landscape, hackers turn to fellow hackers (along with manuals, books, mailing lists, documentation, and search engines) for constant information, guidance, and help.

Coleman writes of the credo or values of openness, transparency, and access within the open source community. Providing content to be found by searches, including answering Stack Overflow questions, is part of this openness.¹¹⁰ But while some of those answers do discuss how to search, the transparency generally seems not to include the search activity itself. Though there is significant reciprocity in Repairing searching and community work in Extending searching, the ostensible moments of searching seem to be treated differently. The activity at the search box and on the SERP seemed to be that portion of the work that people are expected to do on their own (and that people may interpret as protected, through mutually enacted secrecy (Seaver, 2017, p. 5), to do on their own, in the backstage (Goffman, 1956)).¹¹¹

Learning in organizations and in privacy

We can pull these all together—and address solitary and secretive searching and the absence of technocratization of search—with support from literature from organizational learning and privacy for learning. A body of research shows that learning (including organizational learning) may be facilitated by privacy (Bernstein, 2012). If you are expected to do a lot of learning, you might pursue a zone of privacy to avoid exposing what you do not know or to avoid being disrupted in your learning efforts.

Data engineers rely on web search because neither they nor their larger organization knows all they need to know. Web search is a tool for pursuing answers to definite questions, and reliance on it is a strategy that affords significant flexibility, insofar as the extensions of search are well-aligned with the search problems in question. Thompson (1967) writes that “Uncertainty appears as the fundamental problem for complex organizations, and coping with uncertainty, as the essence of the administrative process” (p. 159). Part of that coping is achieved by bounding rationality through organizational structure: “the bounding of rationality requires structural decentralization, the creation of semiautonomous subsystems” (p. 54; p. 161). Organizations pursue “dual searches for certainty and flexibility” (p. 150). At the higher levels of the organization, at the time of his writing, flexibility and slack were the focus. At the lower decentralized levels at the technical core, it was certainty.

Subsequent work, looking at the constant change and competition in 2000s era web design and new media, suggests the decentralization extends to the level of individuals—with flexibility also being pushed lower. Girard & Stark (2002) write of search as “generalized and distributed”. Kotamraju (2002) writes of the expectation that the “flexible reinvented worker” will be constantly “keeping up”. Neff (2012) writes how “work itself has been largely individualized” (p. 14) and how “individuals now bear most of the costs of flexibility and are responsible for activities previously thought of as within the purview of companies” (p. 14). Neff writes (p. 18):

Considering the quick turnaround times on the development of computer applications, employees are expected, in the words of one programmer, to “hit the ground running” with continually updated skills, including new programming languages and familiarity with new technologies.

The organizational approaches to (and shaping of) the dynamism of 2000s era web design and new media may be compared to the organization of work in data science, machine learning, and data engineering today. Avnoon (2021) writes of data scientists’ “intensive self-learning” to “keep[] abreast” and “the emotional stress caused by continuously attempting to keep pace with technological and theoretical developments.” The data engineers I interviewed also remarked on there being too much to know, on the stress of constant change in tools and practices, and on flailing around, frustrated and desperate, in the search bar.

My argument is that the solitariness and secrecy of search is a consequence of this organizational (and professional) response to uncertainty, made within the context of the preceding norms of (1) searches as private, of (2) knowledge as something ‘owned’ and performed for status, and of (3) the boundaries of reciprocity. Perhaps data engineers do not grant that their searching might be technocratized, informated, or automated because of the challenge it would pose to their identity and ways of knowing. Perhaps they do not publicize and compete in, or share and share in, their searching because they need their solitary searching to remain private so they might present the appearance of rugged individualism. Transparency (and concomitant competition in a masculine culture) might shatter the illusion of individualism, meritocracy, and genius¹¹², and challenge the systems that favor them.

Technocratized or collaborative and transparent searching might risk removing artificial barriers to the fields. As it is, the extensive solitary searching and demand for self-reliance may be an artificial barrier. Ensmenger (2015) argued that the “whiz kid” or “computer boy” identity (p. 65):

provided programmers with many of the perceived benefits of professionalization: the establishment of barriers to entry to the discipline, the possession of a “monopoly of competence,” and mastery over an esoteric body of knowledge. [internal footnotes omitted]

Various approaches to change may require a shift to explicitly viewing requests for help as a chance to connect and learn (Perlow & Weeks, 2002) rather than as an interruption¹¹³ and a sign of individual weakness. Technocratized or collaborative and transparent searching may also require addressing cultures of shame (or they could risk slowing learning as people hide or avoid searching). Turco (2016) demonstrates how surveillance can be imagined by the surveilled to be a form of access to managers, and so a benefit (see also Stark & Levy (2018)).

Changing the technology as a way of changing the culture may be a “punctuating force” and could “disrupt an established social structure” (Leonardi, 2007), and a “forced representation of work” (Star & Strauss, 2004) may have unanticipated consequences. The decision makers within the field may resist changes because they benefit from the flexibility that structurelessness provides (Freeman, 2013).

Conclusion: Uncertainty sink

Individuals are identified as responsible for their searching. This contributes to solitary and secretive searching. People still participate in the larger shared work practices, shown in Extending searching, and newcomers may gradually participate, to varying degrees, in fixing failed searches, shown in Repairing searching. This participation introduces new data engineers to the ways of working and searching as a data engineer. For many data engineers that is enough: they are able to jockey and save face as they ask questions to repair failed searches, and to make friends or otherwise develop strong rapport with more experienced coworkers. Those relationships allow them to ask questions outside of the search repair channels and not suffer as much when their public questions are out of place. But others stay on the periphery. Already marginalized in the larger community, they are judged more harshly for their lack of knowledge and made more responsible for searching on their own. How responsibility for searching is positioned creates a cyclic loop that keeps penalizing those on the outside, and the responsibility for searching is distributed to individual searchers.

The status quo for the data engineers is devoid of technocratization (intentional application of technique to influence search activity). While searching is a collaborative endeavor, the data engineers’ searches are solitary and secret. While some privacy for searches facilitates individual learning, the current balance of secrecy may limit organizational learning and limit effective inclusion in data engineering. Data engineers search “on their own” and conceal their learning work. Individual searchers act as an “uncertainty sink”¹¹⁴, allowing the organization to act nimbly, assigning to the individual “flexible reinvented worker” (Kotamraju, 2002) the responsibility to maintain and improve their skills. But that responsibility is not governed or managed by the organization.

Bibliography

Antin, J., & Cheshire, C. (2010). Readers are not free-riders: Reading as a form of participation on Wikipedia. Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, 127–130. https://doi.org/10.1145/1718918.1718942 [antin2010readers]

Avnoon, N. (2021). Data scientists’ identity work: Omnivorous symbolic boundaries in skills acquisition. Work, Employment and Society, 0(0), 0950017020977306. https://doi.org/10.1177/0950017020977306 [avnoon2021data]

Bailey, D. E., & Leonardi, P. M. (2015). Technology choices: Why occupations differ in their embrace of new technology. MIT Press. http://www.jstor.org/stable/j.ctt17kk9d4 [bailey2015technology]

Bailey, D. E., Leonardi, P. M., & Chong, J. (2010). Minding the gaps: Understanding technology interdependence and coordination in knowledge work. Organization Science, 21(3), 713–730. https://doi.org/10.1287/orsc.1090.0473 [bailey2010minding]

Bansal, C., Zimmermann, T., Awadallah, A. H., & Nagappan, N. (2019). The usage of web search for software engineering. arXiv Preprint arXiv:1912.09519. [bansal2019usage]

Beniger, J. (1986). The control revolution: Technological and economic origins of the information society. Harvard University Press. [beniger1986control]

Bernstein, E. S. (2012). The transparency paradox: A role for privacy in organizational learning and operational control. Administrative Science Quarterly, 57(2), 181–216. https://doi.org/10.1177/0001839212453028 [bernstein2012transparency]

Beunza, D., & Stark, D. (2012). From dissonance to resonance: Cognitive interdependence in quantitative finance. Economy and Society, 41(3), 383–417. https://doi.org/10.1080/03085147.2011.638155 [beunza2012dissonance]

Christin, A. (2017). Algorithms in practice: Comparing web journalism and criminal justice. Big Data & Society, 4(2), 1–12. https://doi.org/10.1177/2053951717718855 [christin2017algorithms]

Coleman, E. G. (2012). Coding freedom: The ethics and aesthetics of hacking. Princeton University Press. https://gabriellacoleman.org/Coleman-Coding-Freedom.pdf [coleman2012coding]

Constant, D., Sproull, L., & Kiesler, S. (1996). The kindness of strangers: The usefulness of electronic weak ties for technical advice. Organization Science, 7(2), 119–135. [constant1996kindness]

Dreyfus, H. L., & Dreyfus, S. E. (2005). Peripheral vision: Expertise in real world contexts. Organization Studies, 26(5), 779–792. [dreyfus2005peripheral]

Dzonsons, K. (2011). History of unix manpages. https://manpages.bsd.lv/history.html [dzonsons2011history]

Ekbia, H., & Nardi, B. (2014). Heteromation and its (dis)contents: The invisible division of labor between humans and machines. First Monday. https://doi.org/10.5210/FM.V19I6.5331 [ekbia2014heteromation]

Ekbia, H. R., & Nardi, B. A. (2017). Heteromation, and other stories of computing and capitalism. MIT Press. [ekbia2017heteromation]

Ensmenger, N. (2015). “Beards, sandals, and other signs of rugged individualism”: Masculine culture within the computing professions. Osiris, 30(1), 38–65. http://www.jstor.org/stable/10.1086/682955 [ensmenger2015beards]

Fourcade, M., & Healy, K. (2017). Seeing like a market. Socio-Economic Review, 15(1), 9–29. https://doi.org/10.1093/ser/mww033 [fourcade2017seeing]

Freeman, J. (2013). The tyranny of structurelessness. WSQ, 41(3-4), 231–246. https://doi.org/10.1353/wsq.2013.0072 [freeman2013tyranny]

Girard, M., & Stark, D. (2002). Distributing intelligence and organizing diversity in new-media projects. Environment and Planning A, 34(11), 1927–1949. [girard2002distributing]

Goffman, E. (1956). The presentation of self in everyday life. University of Edinburgh. [goffman1956presentation]

Hodgson, G. M. (2001). How economics forgot history: The problem of historical specificity in social science. Routledge. [hodgson2001economics]

Kotamraju, N. P. (2002). Keeping up: Web design skill and the reinvented worker. Information, Communication & Society, 5(1), 1–26. https://doi.org/10.1080/13691180110117631 [kotamraju2002keeping]

Leonardi, P. M. (2007). Activating the informational capabilities of information technology for organizational change. Organization Science, 18(5), 813–831. https://doi.org/10.1287/orsc.1070.0284 [leonardi2007activating]

Leonardi, P. M. (2011). When flexible routines meet flexible technologies: Affordance, constraint, and the imbrication of human and material agencies. MIS Quarterly, 35(1), 147–167. http://www.jstor.org/stable/23043493 [leonardi2011flexible]

Lianos, M. (2010). Periopticon: Control beyond freedom and coercion and two possible advancements in the social sciences. In K. D. Haggerty & M. Samatas (Eds.), Surveillance and democracy (pp. 69–88). https://www.taylorfrancis.com/chapters/edit/10.4324/9780203852156-11/periopticon-control-beyond-freedom-coercion-two-possible-advancements-social-sciences-michalis-lianos [lianos2010periopticon]

Morris, M. R. (2013). Collaborative search revisited. Proceedings of CSCW 2013. https://doi.org/10.1145/2441776.2441910 [morris2013collaborative]

Morris, M. R., & Horvitz, E. (2007). SearchTogether: An interface for collaborative web search. Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology, 3–12. [morris2007searchtogether]

Morris, M. R., & Teevan, J. (2009). Collaborative web search: Who, what, where, when, and why. Synthesis Lectures on Information Concepts, Retrieval, and Services, 1(1), 1–99. https://doi.org/10.2200/S00230ED1V01Y200912ICR014 [morris2010collaborative]

Neff, G. (2012). Venture labor: Work and the burden of risk in innovative industries. MIT Press. https://mitpress.mit.edu/books/venture-labor [neff2012venture]

Nissenbaum, H. (2011b). From preemption to circumvention: If technology regulates, why do we need regulation (and vice versa). Berkeley Tech. LJ, 26, 1367. [nissenbaum2011preemption]

Orr, J. E. (1996). Talking about machines: An ethnography of a modern job. ILR Press. [orr1996talking]

Perlow, L., & Weeks, J. (2002). Who’s helping whom? Layers of culture and workplace behavior. Journal of Organizational Behavior, 23(4), 345–361. https://doi.org/10.1002/job.150 [perlow2002helping]

Reagle, J. (2016). The obligation to know: From FAQ to feminism 101. New Media & Society, 18(5), 691–707. https://doi.org/10.1177/1461444814545840 [reagle2014obligation]

Seaver, N. (2017). Algorithms as culture: Some tactics for the ethnography of algorithmic systems. Big Data & Society, 4(2), 1–12. https://doi.org/10.1177/2053951717738104 [seaver2017algorithms]

Shestakofsky, B. (2017). Working algorithms: Software automation and the future of work. Work and Occupations, 44(4), 376–423. https://doi.org/10.1177/0730888417726119 [shestakofsky2017working]

Star, S. L., & Strauss, A. (2004). Layers of silence, arenas of voice: The ecology of visible and invisible work. Computer Supported Cooperative Work (CSCW), 8, 9–30. [star1999layers]

Stark, L., & Levy, K. (2018). The surveillant consumer. Media, Culture & Society, 40(8), 1202–1220. https://doi.org/10.1177/0163443718781985 [stark2018surveillant]

Surden, H. (2007). Structural rights in privacy. SMU Law Review, 60(4), 1605. https://scholar.law.colorado.edu/articles/346/ [surden2007structural]

Thompson, J. D. (1967). Organizations in action: Social science bases of administrative theory (1st ed.). McGraw-Hill. [thompson1967organizations]

Turco, C. J. (2016). The conversational firm: Rethinking bureaucracy in the age of social media. Columbia University Press. [turco2016conversational]

Weber, S. (2004). The success of open source. Harvard University Press. https://www.hup.harvard.edu/catalog.php?isbn=9780674018587 [weber2004success]

Zuboff, S. (1988). In the age of the smart machine. Basic Books. [zuboff1988age]

Zuboff, S. (2015). Big other: Surveillance capitalism and the prospects of an information civilization. Journal of Information Technology, 30(1), 75–89. https://doi.org/10.1057/jit.2015.5 [zuboff2015big]

This interview excerpt is also in Admitting searching . ↩︎
This search work on one’s own is related to the ‘due diligence’ that research participants described doing before asking colleagues for help (discussed in due diligence and packaging questions ). You can turn to colleagues for help, generally after assuming responsibility for searching, and they may provide help but they are unlikely to assist in the search box or on the SERP. The web search activity consisting of interacting directly with the search engine is generally the responsibility of the solitary data engineer. ↩︎
See in Admitting searching : search talk? . ↩︎
An interest in speedy searching , next section. ↩︎
This interview excerpt is also in Running workable code . ↩︎
This last line from Shreyan may be read to suggest an explanation or justification for the secrecy around search and its distributed nature. There may be a desire to avoid ‘resonance’, a term to describe something akin to groupthink perhaps, that Beunza & Stark (2012) use to describe “a dangerous form of cognitive interdependence” where productive dissonance is disrupted by the lack of diversity in approaches. Distinctly different, when I clarified with Shreyan later in the interview, he referred to research on brainstorming that suggested it was most effective when people were able to generate ideas on their own rather than in a group, before critiquing them. ↩︎
A fuller treatment would engage more deeply with writings on the future of work, surveillance studies, and perhaps the quantified-self movement. Lianos (2010), for instance, could be read to suggest that the logic of what to do with accumulated data is not clear, writing “accumulated data do not necessarily amount to a plan and even less so to a totalitarian plan” (p. 72). ↩︎
Except insofar as tooling or structure is added to the process of asking questions of colleagues, discussed in Repairing searching . ↩︎
For instance, I wrote in Python a simple and clunky plugin for myself for the Sublime text editor that takes a simple notation (the search tool code followed by the query in square brackets) and, with a hotkey, opens a browser tab with the search. So I can write g[search this] or bmail[search this] in my notes and my hotkey (i.e. Cmd+l+o) will open a tab in my browser to the Google or Berkeley Gmail search, respectively, for [search this]. This sacrifices some features of searching in the address bar or in the search box. For example, I do not see suggested searches or auto-completed phrases from the search engine. ↩︎
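The notation described in this note can be illustrated with a minimal sketch. This is not the plugin itself: it leaves out the Sublime integration and the hotkey, and the function names, the regular expression, and both URL templates (in particular the Gmail one, which will differ by account) are assumptions made only for illustration.

```python
import re
import webbrowser
from urllib.parse import quote_plus

# Hypothetical mapping from a search tool code to a URL template.
# Both templates are illustrative assumptions, not taken from the plugin.
SEARCH_TEMPLATES = {
    "g": "https://www.google.com/search?q={query}",
    "bmail": "https://mail.google.com/mail/u/0/#search/{query}",
}

# A tool code made of letters, followed by the query in square brackets.
NOTATION = re.compile(r"(?P<tool>[A-Za-z]+)\[(?P<query>[^\]]+)\]")


def notation_to_url(text):
    """Return the search URL for the first tool[query] found in text, else None."""
    match = NOTATION.search(text)
    if match is None:
        return None
    template = SEARCH_TEMPLATES.get(match.group("tool"))
    if template is None:
        return None  # unknown tool code
    return template.format(query=quote_plus(match.group("query")))


def open_search(text):
    """Open a browser tab for the notation in text; return True if one was found."""
    url = notation_to_url(text)
    if url is None:
        return False
    webbrowser.open_new_tab(url)
    return True
```

Calling open_search("g[search this]") on a line of notes would open the corresponding search in a new tab. As with the plugin, nothing here shows the suggested searches or auto-completions offered in the search box.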
Such monitoring or scaffolding does not fall under the rubric of gap-bridging discussed by Bailey & Leonardi (2015), but could provide support for navigating gaps. ↩︎
I am not using the pseudonym of the interviewee to reduce the likelihood of identification. ↩︎
Nissenbaum (2011b) wrote of Adnostic, a system for privacy preserving targeted advertising:

In the effort to gain a toehold for Adnostic, technical functionality is not the greatest barrier. We have found ourselves up against a cultural mythology of innovation, incredibly powerful in the context of the Internet and web.

↩︎
Very few of my research participants made mention of providing content on forums such as Stack Overflow, whether asking questions or answering. I do not venture an explanation for this lack of reciprocity, though I note that research like that from Antin & Cheshire (2010) indicates that readers and lurkers still provide significant value to online communities. So these individuals can still be seen to contribute to their broader community through their website visits. ↩︎
Goffman (1956, p. 70): “the back region will be the place where the performer can reliably expect that no member of the audience will intrude.” ↩︎
Perhaps the solitary and secretive searching preserves the appearance of genius, the esoteric. Ensmenger (2015) tells of John Backus, developer of FORTRAN, critically calling programming of the 1950s “a black art, a private arcane matter”:

While Backus did not intend this description to be complimentary—as an aspiring computer scientist he saw this reliance on individual ability and local knowledge to be demeaning—many other programmers saw this emphasis on personal creativity and esoteric skill as the source of their professional authority. To be a devotee of a dark art, a high priest, or a sorcerer (all popular metaphors used to describe programming in this period) was to be privileged, elite, master of one’s own domain. It was certainly preferable to being characterized as a glorified clerical worker or a “mere” technician. [internal footnote omitted]

↩︎
Orr (1996) writes of the systematic approaches of the copy repair technicians making their work “interruptible”: “Being systematic has the advantage of being interruptible” (p. 145). ↩︎
Like a heat sink (or heatsink), which moves heat from a heat generating component to a medium that dissipates heat, some business uncertainty is addressed by the delegation of searching to individuals. ↩︎