This is a chapter of a published dissertation: Situating Web Searching in Data Engineering: Admissions, Extensions, Repairs, and Ownership.
Now scientists everywhere use the air pump, say, or the electrophoresis gel without thinking about it. They look through the instrument the way one looks through a telescope, without getting caught up in battles already won over whether and how it does the job. The instrument and all of its supporting protocols (norms about how and where one uses it, but also standards like units of measure) have become self-evident as the result of social processes that attend both laboratory practice and scientific publication. (Gitelman, 2006, p. 5)
Sometimes I just wonder, like, who taught them how to search? (Victor)
“To be frank I’ve like really never thought about it myself even though it’s kind of like 90% of my job to just like look up things.” That is what Amar told me at the start of our interview. To a certain extent the data engineers use web search without thinking about it. “It is kinda like breathing” (Phillip), “something that people maybe take for granted” (Lauren). So how then do they learn to search at work?
There is very little explicit instruction on web search practices in the data engineering workplace. Although searching constitutes a significant portion of their work, data engineers are neither taught how to search the web nor evaluated directly on their search performance. While there are a range of onboarding processes and mentorship models, generally only in the earliest stages of their careers are new data engineers offered any direct advice about how to search or told how more experienced data engineers do so. Even at this early stage, advice and insight are sparse. Furthermore, data engineers rarely engage in what I call “search talk”: discussion of their search queries, search result evaluation processes, or how they reformulate queries or follow threads in pursuit of an answer.
There is little opportunity for data engineers to directly observe or participate in other data engineers’ searching. Its form–a small box on a terminal designed for individual use–affirms search as a solo act. In the absence of formal training, limited professional discussion, and a form factor that limits observation, one might predict that web search for data engineering, like learning to program a VCR, may be difficult to learn, as compared to “a fundamentally social practice” like learning to drive a car (Brown & Duguid, 1996, p. 51).
The analytic of legitimate peripheral participation (LPP), however, helps identify where data engineers are provided opportunities to participate and learn what it means to effectively use web search as a data engineer. Modifying Beane’s (2019) concept of shadow learning, “a set of practices in the shadows outside the legitimate peripheral participation typical of the literature on communities of practice” (p. 91), I locate participation and legitimacy in how search is admitted: in “search confessions”, the self-deprecating or hyperbolic remarks data engineers make about their extensive reliance on web search and their web searching practices (the topic of this chapter), and in the occupational, professional, and technical forces that explicitly and implicitly structure search practices (discussed in chapters 4 and 5).
In the absence of formal training or apprenticeship, “search talk”, or even visibility into the successful search practices of other data engineers, data engineers collectively wrestle with and affirm the appropriateness of their reliance on web search through “search confessions”. At face value “search confessions” appear to be jocular, off-the-cuff jabs at the profession’s reliance on search. In practice, however, they affirm that reliance, acting as informal search approbations. In conjunction with “search confessions”, the absence of “search talk” further affirms the implicit acceptance of such heavy reliance on web search while also marking searching practices as private, appropriately free from remark, and protected from direct scrutiny. Search confessions are a site of legitimate peripheral participation: they expose new data engineers to the ongoing process through which reliance on search and norms about its use are negotiated, re-made, and affirmed. While data engineers do not directly engage each other in the moment of searching, their web searching is informed by this confessional talk about search. I find that data engineers, rather than being directly taught how to form and reform queries or how to evaluate and course correct, learn how to search through confessions about and around search and through silences about exactly how to do it (the absence of “search talk”).
The reliance on search confessions to normalize the use of search and, to some extent, train and educate data engineers about effective and legitimate use of web search in work practice comes at a cost. It presents barriers to those marginalized in technology work today (discussed in chapter 6).
The next section looks more closely at the LPP literature, focusing on learning in the shadows. This is followed by a presentation of my empirical findings and analysis. Then I discuss implications for our understanding of LPP and re-situate this chapter within the dissertation.
Lave & Wenger (1991) claim that: “Learning viewed as situated activity has as its central defining characteristic a process that we call legitimate peripheral participation”44 (p. 29). This concept is the outgrowth of their desire to write about apprenticeship and their phrasing highlights that learners “inevitably participate in communities of practitioners” and success in learning requires learners to “move toward full participation” (p. 29).
They propose the concept of legitimate peripheral participation to describe “engagement in social practice that entails learning as an integral constituent” (p. 35). The “central preoccupation” of their book “is to translate this into a specific analytic approach to learning” (p. 35). Cautioning against decomposing the concept into three components, they write that it is “to be taken as a whole” (p. 35). That is, there is not an illegitimate peripheral participant who learns and so challenges the theory; rather, the sort of legitimacy of participation will shape what is learned. Similarly, peripherality is about the “ways of being located in the fields of participation defined by a community” (p. 36). They suggest these “ambiguous potentialities” provide “access to a nexus of relations otherwise not perceived as connected” (p. 36), offering a new and distinct “analytical perspective”.
Over the last 30 years LPP has been widely applied. John Seely Brown and Paul Duguid popularized it, also in 1991 (see Contu & Willmott, 2003, p. 283). Other work on LPP includes Brown & Duguid (1991), Orr (1996), Brown & Duguid (1996), Brown & Duguid (2001), Contu & Willmott (2003), Bechky (2006b) (referring to both Bechky (2003) and Bechky (2006a)), Duguid (2008), Takhteyev (2012), and Gasson & Purcelle (2018).
The LPP analytic lens helps direct attention to interactions between data engineers, to learning opportunities and participation. While direct participation in the moment of searching is rare (the data engineers rarely “pair search”, even though some may pair program), participation in data engineering work practices provides opportunities for participation in the larger search practices. Data engineers do not have formal training in search. They do not collaborate at the search box or on the SERP. So I considered talk.
Talk is a key element of participation. Partly through talk, stories, and jokes, people construct shared understandings of the work and their identity. This is seen in Orr (1996), who finds that “[n]arrative forms a primary element” of the practice of photocopy repair technicians (p. 2). Talk is “instrumental”: stories and conversations circulate knowledge of machines, customers, and the task of diagnosing and fixing problems. It also shapes identity: the technicians “tell tales to establish their membership in the community” (p. 142).
First, data engineers say they don’t talk about it.
There is limited explicit instruction. In the interview with Ross, after we talked about the various sorts of places he would search at work, he said the following in talking about on-boarding a new hire:
I probably had a brief conversation with them. That was, you know, five sentences that summarized what we’ve already talked about. ‘You go to the web for this kind of stuff. Go to the wiki for this kind of stuff, and Slack for this kind of stuff.’
Amar had recently started on-boarding for a new job after several years at a previous company. He had been successful there, rising to a technical leadership role within his team.
Midway through the interview I asked Amar: “Are you talking with your team about the searches you’re doing? When they join your team are you saying: ‘Here’s my process for searching. This is what you should do.’ Or?”
Um, I think, that’s an interesting question. [pausing and proceeding slowly at first] I don’t believe I’ve ever done that except for… except with one engineer and the reason why I did that with that engineer is that was an intern and they were not very— That person was an intern who joined [the company] full-time but they didn’t have a lot of professional work experience outside of internships.
So they were a fresh grad, fresh out of undergrad. For them, because they didn’t have a dedicated process—and it wasn’t me going out of the way, because I’d never prescribe that this is how you should do it—But they kind of were ‘Hey, every time I have a problem I have to like, like do a couple of Google searches and if I can’t find anything I have to come to you and then you, even if you don’t have an answer immediately you pretty much find it, find resources pretty quickly, how? So what do you do?’
And then that’s when, that’s the only time I kinda said, hey this is my process and this works for me [emphasis from interviewee] but outside of that engineers don’t really—at least in my experience or at least within my team—will not explicitly discuss their process.
Following up, he said:
I don’t think I’ve actually talked with the team, but maybe I should, as like a personal note. [ . . . ] it is just part of the job that usually not very apparent unless its like very very inexperienced engineers.
Pair programming (working together on the same code at the same time either next to each other or remotely) might be a place where searching is talked about. While the data engineers I talked to generally did not practice pair programming45, those that did generally reported searches being hidden and not discussed.
Christina said there is less pair programming in data engineering than elsewhere, but said, “If I were pair programming with someone and I was sharing my screen I would have a tendencies to pull up my search on a separate screen.”
Likewise, Megan said:
I’ll notice people turn off sharing when they switch to searching, and then they’ll find something and turn sharing back on. …a lot of stuff people are really interested in collaborating on, but search is very private. It is something you go do and then you come back and share the results of your search…
Ross said that if he were doing a screenshare with a colleague on one of his screens and had to look something up he’d open a new tab in the other screen, not shared, and do his search, saying “I don’t think I’m the only one like that.”
Second, there is a submerging of web search itself in talk nominally about web search. Even interactions that interviewees would describe as being about searching the web were not directly so.
Here is a response from Sameer, as we were talking about mentoring interns or new college graduates, when I asked him for an example of “politely suggesting googling”:
I’ve realized that schools, depending on which program you go to, computer science majors, have a lot of theoretical knowledge. Graduates will have knowledge about distributed systems, algorithms, data structures, but then actually coming to a company and writing code is different. So there’s a lot of guidance and mentorship around that. And obviously if the intern or new college graduate does not have experience in industry then sometimes I do think we need to, politely, point them to search—‘Hey, have you tried googling it? Because it seems like a very simple thing you can find yourself.’
But I think sometimes people, when they are stuck in that rabbit hole it is a very thick forest. When you are googling things you can hit one web page and be like ‘oh, I don’t know what this means’ and then go to a second line and “oh I don’t know this is either!” and suddenly you’re learning about quantum physics, right? [laughter] So, so, very far away from what you started out with. So you kinda need to understand what to Google, where to stop, and where to just ask someone for help.
I asked a follow-up: Can you recall any of those conversations or times when you politely suggested googling?
So the way I do this is by, one of the easiest ways to do this, is just to send someone a Stack Overflow link and go, ‘oh hey, someone already answered this question. Here you go.’ And I hope they read between the lines, ‘I should be googling this.’
If I don’t send a Stack Overflow link and I’m just solving the problem for them then I will definitely have a one-on-one and have a conversation, ‘OK, this is how you should be solving it.’ I totally get that this [answer to the question asked] is not common knowledge. But I hope that the intern or new college graduate can read between the lines. I don’t want to have a conversation with anyone saying ‘did you try googling this yet?’ It’s not very polite, I feel.
And most people pick it up. It is very rare that someone would bother me with something easily queryable again and again.
Sameer presented this story of sharing a link (that he had found by searching) as an example of politely suggesting googling. He made no mention of googling or web search in his description of what he explicitly communicated to his colleague. The use of the web search tool itself is kept below the surface.
The coworkers in this example avoid mention of web search. This sort of “tactful inattention” (Goffman, 1956, p. 147) reproduces boundaries or norms around whether or how the tool itself is mentioned.
Third, there is speculation about how junior engineers learned to search. This demonstrates further the lack of formal instruction on search.
Victor, one of the data engineers who reported their team regularly pair programmed, described working with a junior colleague:
One thing that I find like when I pair program with more junior engineers is the way they do the query searches is very different than I would approach it. So, they’ll just ask me, they’re like: ‘Um, how do I run this command in docker.’ It’s like, I don’t know. What— Like let’s— I don’t know. Like do you think I just memorize it? No. Let’s, let’s Google it. And their search: [docker [command]]
Do you think that search is going to get the answer you want? How does that even happen? ‘Well, what would you search?’ Let’s just repeat the question you asked me and type that into Google. Sometimes I just wonder, like, who taught them how to search? I remember being in like 5th or 6th grade and having a class about how to do web searches. I guess that’s something not everyone does anymore.
This speculation highlights the lack of visible formal search education in the workplace, made visible here, perhaps, in a collision of generational perspectives. Sundin (2020) studied how older youth in Sweden use general-purpose search engines. He found that “search engines almost never seem to have been a visible information infrastructure for the current generation of teenagers” (Sundin, 2020, p. 378).
More broadly, these comments and stories from interviewees point to how search is multiply and complexly hidden. Search talk is absent partially because search has, for many interviewees, become infrastructure: habit and routine, “like breathing.” It is also avoided because of the sensitivity developed from what it might reveal or suggest about one’s own or another’s knowledge, or lack thereof (the secrecy covered in the final analytical chapter, Owning searching). But it may also be a tacit “action-centered skill” (Zuboff, 1988, p. 186), knowledge or knowing that “cannot be put into words” (Polanyi, 1967, p. 4).46 Zuboff (1988) writes of how a richly textured tacit skill is “deeply embedded in crisscrossing relationships, and too continuous to be captured in a verbal description” (p. 187). It is likely, Zuboff writes, that “attempts at explication of such tacit knowledge must always be incomplete. The knowledge is too layered and subtle to be fully articulated. That is why action-centered skill has always been learned through experience (on-the-job-training, apprenticeships, sports practice, and so forth)” (p. 188). Even if explication of tacit knowledge about search activity is always incomplete, that does not mean explication could not be useful or would not be tried. But the difficulty of communicating tacit knowledge may multiply the sensitivity in talking about searching, if attempts to discuss it themselves seem to indicate some lack of self-awareness or inability to communicate. In this case, moreover, it isn’t just that the data engineers don’t verbally share their search activity; they don’t materially or experientially share their search activity either. The data engineers do not join together in the action in the search bar or on the search results page.
The search activity itself seems relegated to (or reserved for) the backstage. As Goffman (1956) writes (p. 69):
A back region or backstage may be defined as a place, relative to a given performance, where the impression fostered by the performance is knowingly contradicted as a matter of course. There are, of course, many characteristic functions of such places. It is here that the capacity of a performance to express something beyond itself may be painstakingly fabricated; it is here that illusions and impressions are openly constructed.
Certain impressions of an individual’s knowledge or knowing, impressions necessary for data engineers to engage as experts in workplaces that conceive of knowledge as something possessed by and the responsibility of individuals, are developed in a protected backstage.
But they do talk about search; spotting examples of such talk led to my interest in studying this site. Even though they do not engage in “search talk” (discussions about what they input into a search box or how to parse the results pages), they talk about search. They discuss searching and the motivations behind searches, although the searches themselves go unmentioned or are only subtly implied. This talk about search shapes their understanding of acceptable use of search. Here I will focus on confessions data engineers make about their use of web search. Repairing searching, the fifth chapter, discusses another space where data engineers talk about search, sharing about and fixing failed searches.
Data engineers profess to making extensive use of web search at work. I expected such professing given my initial experience leading to this research. I did not expect, though, the forcefulness and apparent hyperbole and overgeneralization in these statements from the data engineers.
At the start of our conversation Amar laughed as he said, as though revealing some embarrassing secret, “it’s kind of like 90% of my job to just look things up.” Christina chuckled, saying “probably 90% of my job is Googling things.” Ross, likewise, said, “it’s a large part of my job.” Over email while scheduling the interview, Vivek wrote: “Web Search is a part of everything I do.” Noah said, “I consider it a core of doing my job.”
I call these search confessions. Search confessions are statements, often self-deprecating or hyperbolic, about one’s reliance on web search. Many of these confessions are delivered as though admitting something somewhat shameful, something that others may find wrong or weak. Sometimes the confessions accompany a statement that there is nothing to be ashamed of. Search confessions are statements that individually admit to a reliance on web searching and collectively admit the practice of searching for work into the work practices of data engineering.
These statements mirrored those that started my research, both the initial tweets and the blog posts, forum posts, tweets, and TikToks I later found. A key difference is that the above examples were directed to me and not to a general audience of peers or fellow community members.
Megan, talking of how people will admit to searching but not share the searching itself, shared:
People who are all on board for ‘do your work in public’, ‘show your mistakes’, will still keep hidden the specific process of searching. That is something that they’re not as eager to share. There are a number of people who say searching isn’t bad, we all do it all the time. But it still feels somewhat shameful. I don’t know what that is about. [ . . . ] Um, um, and they’re not, they’re not even like hiding it necessarily. They’re like, they’ll say like, oh, I’m just going to look this up real quick and pop it in, but they still do it in a separate window. See even if they’re not trying to like, deceive about the fact that they’re searching, they don’t want the process of searching to be visible. Which I think is interesting. I don’t know.
Christina is a data engineer working as a consultant helping external clients with her company’s enterprise software tools. I asked about her initial reaction to hearing about the research topic and she said, “That probably 90% of my job is Googling things,” and chuckled. She laughed when I told her that I wanted to know why everyone says 90%. She said, “because it’s the majority, and it’s not just the majority. I wouldn’t go 50%, 60%, its like I can barely think of other things I do.” Then, through laughter: “Meetings. And meetings. But I multitask so even while I’m in a meeting.”
At the close of the interview I asked for any final reflections. Christina said:
It was interesting because I hadn’t connected the way I search day-to-day with our whole company initiative for developer experience. [ . . . ] So it makes a lot of sense, that while I am not often searching, googling, company-specific things, I wish I could but the answers aren’t there.
I guess my 90% I said at the beginning is actually pretty wrong because I probably spend a lot more time asking people questions than asking the internet questions.
In a follow-up member check interview I told Christina how I had identified and described search confessions. She said, “Yeah, yeah. This is completely coincidental,” and went on to share that her IP address (working from home) had recently been blocked by Stack Overflow.
I screenshotted the other day that Stack Overflow blocked my IP address because I sent too many requests. I screenshotted it and sent it to our team Slack channel. So it was kind of an acknowledgement of ‘oh, look, this is kind of embarrassing I’ve asked too many questions.’ My team is three people. We all know we ask a lot of questions, so it’s not shameful.47
At the close of the follow-up with Christina, I apologized for running so long over our planned meeting time and she replied, laughing: “No, that’s OK. I’ve needed a break from searching.”
Comments and jokes about Stack Overflow offer another means of confession. Recall Noah, who said he wasn’t “the greatest googler of all time”, and that if he found out that some of his coworkers were searching the web a lot less than him, he would “actually wonder if they weren’t doing their job as effectively as they could.” When I asked him about Stack Overflow memes he shared that he has a laptop sticker that says “Copying & Pasting from Stack Overflow”. He mentioned the sticker again a year later in our follow-up member check.
Jane, an analyst working with data engineers at a prominent social media company, shared that the IP address at her company was blocked intermittently by Stack Overflow for a couple months. When it was blocked, engineers would gather on a page explicitly for memes and jokes, asking facetiously who was using it too much or joking that it must have been their individual fault for searching too much while trying to fix a bug. She mentioned Stack Overflow being blocked at her company to her friends, telling them not to have imposter syndrome because nobody knows what they are doing.
Jillian shared a story of joking among colleagues about productivity surveillance software:
One time we were joking about these different productivity surveillance tools that some companies use, for working from home environments specifically. They might take a screenshot of what you’re looking at on your monitor.
And I was like, “oh, I would hate that because I’d be working but it would show that I’m like googling ‘what is a computer’ or like something rudimentary.”
And then, but we, everyone on my team was kind of joking about things like that, you know, just like talking about looking up, you know, this page for ‘explain it to a kindergartner’, whatever.
These search confessions serve multiple learning purposes. The confessions ritualistically provide a space for community members to affirm their commitment to this way of being an expert. Search confessions at once open up web search practices to challenge and create iterative openings for the community to foreclose, through collective affirmation, potential threats to professional identity posed by this general-purpose tool. Search confessions posted on blogs, forums, or Twitter, along with artifacts like the “Copying & Pasting from Stack Overflow” sticker on Noah’s laptop or the “I HAVE NO IDEA WHAT I’M DOING” dog meme48, both mock and validate the data engineering community’s reliance on web search. Confessions acknowledge and celebrate the difficulty of their work, constantly at the edge of the field or juggling far more (constantly changing) information than might seem possible.
Participating in these confessions is part of the ritual of these search-reliant fields. The rituals introduce and admit new members to the community and facilitate interactions between members. Borrowing from Goffman (1956, p. 121), we can identify the search confession as “unofficial communication” that “provides a way in which one [data engineer] can extend a definite but compromising invitation to the other”, through this sort of “‘putting out feelers’”, a “guarded disclosure.” The confessions legitimate web searching and allow data engineers to recognize each other’s shared orientation towards search and drop pretense. Goffman goes on to write:
By means of statements that are carefully ambiguous or that have a secret meaning to the initiate, a performer is able to discover, without dropping his defensive stand, whether or not it is safe to dispense with the current definition of the situation. [ . . . ] it is common for colleagues to develop secret signs which seem innocuous to non-colleagues while at the same time they convey to the initiate that he is among his own and can relax the pose he maintains toward the public.
These search confessions are a sort of secret sign. LPP identifies learning not as absorbing facts, but “deploying through practice the resources [ . . . ] available to you to participate in society, a process [ . . . ] inseparable from the development of a social identity” (Duguid, 2008, p. 3).
Confessions are not admissions of deviance, though they acknowledge a felt-deviance. They are ritual acts designed to elicit assurance and renew the shared commitment to the norm of reliance on web search. They affirm that this search work (‘just’ turning to a general-purpose search engine, which seemingly anyone could do) is the work of the field. These confessions thus legitimize search work and the status of the searching worker as a data engineer. They are little openings for the field to reiterate and maybe rearticulate not only what their work is, but also their jurisdictional claims.
In the place of search talk we find search confessions. The LPP analytic makes search confessions visible, making it possible to see constituent elements of learning, shared and accepted practice. Though recurrently achieved and reproduced, these confessions are informal. The informality with which this community approaches learning to search presents opportunities and poses challenges.
First, the informality and ambiguity of admitting search through search confessions keep open space for maneuverability: the uses of web search engines are kept open to be adjusted as changing circumstances may dictate. As a general strategy, web search is used in data engineering in order to manage uncertainty. This is discussed at length in the fourth analytical chapter, Owning searching, but the fundamental point is that firms have delegated responsibility to individuals to “keep up” (Kotamraju, 2002) and engage in intensive self-learning (Avnoon, 2021). These confessions are not moves of “rhetorical closure”; their confessional frame is probative, serving to test or try, not dispositive, serving to close or finish. The form of legitimation, a confession, keeps open some “debate and controversy” (The Social Construction of Technological Systems, 1993, p. 111) about the use of the tool itself.
Pulling in the language of Handoff, the widespread and ritualized confessions engage individual data engineers to perceive that relying on web search is acceptable and encouraged. The perceptions of affordance are not guaranteed. We could consider alternative modes of admitting search into the professional work of data engineers that may provide more closure, but that would also do more to stabilize the search practices and make them potentially brittle as technologies and problems in the firm’s context change. Firms could record and rate searches, building in explicit incentive structures or technologies to manage or motivate web searching in the workplace. Imagining how these modes might engage data engineers, with force or potentially leading to exaggerated perceptions of constraint and affordance, highlights the flexibility of the search confessions.
Second, the confessions bring attention to norms and remind data engineers, especially those full-fledged members, that here searching is appropriate. Even though searching is solitary and secretive it is admitted as appropriate for data engineers in the confessions. Rather than norm-constrained or norm-defying, the norms of search use are reproduced and improvised as the occupational community makes jurisdictional moves (and distinguishes their norms from the faulty reliance on search decried by critics in school settings or elsewhere in society). The confessions acknowledge a felt-deviance of searching in the shadows but assert their searching is different and necessary. The search confessions perform part of what Orr (1996) discussed of stories and narrative. The confessions, sometimes humble-like, can also be hedging practices, a way to protect themselves from being judged for not knowing something on the spot. And confessions can be used, much like Orr’s technicians’ stories, for pointing to the sheer difficulty and variety of things they must think about and understand, even if the data engineers don’t hold all this knowledge immediately in their head.
To see the benefit of the confessions, compare “shadow learning” that is not confessed. In his doctoral research at the MIT Sloan School of Management, Matt Beane (2017) studied the learning and use of surgical robots under the supervision of Wanda Orlikowski (with Katherine Kellogg and John Van Maanen on his committee)49. He explored “productive deviance”: “norm- and policy-challenging practices that are tolerated because they produce superior outcomes in the work processes governed by those norms and policies”. He was particularly focused on the deviance amidst “significant technical reconfiguration of surgical work”.
Beane’s dissertation reviewed deviance in organizations, the history of productive deviance in the surgical profession, and then turned to two empirical studies on robotic surgery. The first will be discussed here. The study looks at how few surgical residents became confident and competent in the robotic surgery methods. He focuses on the barriers to such learning and shows how those who did learn did so through norm-challenging practices that he calls “shadow learning.”50
Beane (2019) presented shadow learning as norm-challenging practices that work around constraints of efficiency and liability pressures. The work of learning to be a web searching data engineer, often done in the shadows, is a distinct type of shadow learning. Efficiency pressures in data engineering have the effect of increasing reliance on the use of the web search tools, in the shadows and seemingly by individuals isolated from others. Liability pressures in the data engineering organizations appear to produce a distinctly different effect than that found by Beane. Beane found hospital concerns about liability keeping trainees from realistic training. Rather than liability pressures removing training opportunities, liability pressures in data engineering encourage various responses that surround and manage the contributions of new trainees.
The liability pressures, the regulatory or contractual constraints and incentives engaging other components within the systems of the data engineering organization, ground the structuring of the data engineering work practices generally, and so also the web search activity. When veteran data engineers write code it is generally reviewed in some manner and tested before production. Code from new entrants goes through the same processes of review and controlled deployment. Systems established for addressing liability in data engineering allow junior data engineers to participate deeply in many aspects of the work. The data engineering work is organized in such a way that many errors made by individuals are caught and repaired in the normal functioning of the system. This includes errors potentially introduced from web searches.
First, the search confessions are not discriminating. They do not identify the sorts of web searching put to successful use within data engineering. The web search practices of the data engineers are refined or adapted for searches related to the data engineers’ core work tasks. There is an operating envelope for such searching, “a range of adaptive behavior” (Woods, 2018, p. 435). Data engineers’ use of web search for other sorts of tasks may not work so well, even for work-relevant searches if they fall outside the operating envelope. As the next chapter shows, occupational, professional, and technical components structure the selection of inputs and the evaluation of search results. This structuring is particularly well-directed towards concerns central to the responsibilities of the data engineer and aligned with the core interests of their firm or profession.
Situated in and shaped by larger and longer data engineering work practices, these practices for web search generally recede from view and avoid explicit attention. Web search, as an information infrastructure, is often transparent or invisible (Haider & Sundin, 2019).51 The invisibility or ‘taken-for-grantedness’ of search reflected by my research participants is a key feature of infrastructures.52 The occupational, professional, and technical components of the work of data engineers come together to form an infrastructure for search that generally escapes their notice. The data engineers acknowledged they had “never really thought about it” or that searching was “kinda like breathing.” The roles of the structuring and the liability pressures are themselves taken for granted. They are absorbed within the larger infrastructure of data engineering work, becoming transparent. Search confessions do not distinguish searches that are likely supported from those that are not, so there is a risk of encouraging uncritical searching outside the firm’s “sensing routines” (Carlo et al., 2012, p. 870).
Second, with the appropriateness of search constantly questioned and affirmed only informally, those more on the periphery (marginalized within technology work or newcomers) are left with few signals about the appropriateness of search. These confessions are honest, if not a full accounting of data engineer web searching. But they are also humorous and so by design could be misread by those not fully included. In her ethnography of Debian developers, Coleman (2012) writes that humor “gets us closer to the most palpable tension in the hacker world—that between individualism and collectivism” (p. 92). She discusses humor in depth in one chapter,53 defining it as “a play with form whose social force lies in its ability to accentuate the performer, and which at times can work to delineate in-group membership” (pp. 103-104).
Misreadings of the search confessions may lead to a misaligned under-reliance on search or over-reliance on search alone (rather than searching facilitated by conversation with others about questions and failures). Those who are already fully participating may actually find it easier to search, and may search more and more effectively because they do have more domain knowledge and are more fully situated within organizations and the field so as to better inform their search queries and evaluation. This may slow or reduce learning opportunities for those already on the periphery and, at the extreme, exclude them.
I’ll expand on these challenges from “searching in the shadows” by looking at the “consequences” Beane identified as resulting from “shadow learning” in robotic surgery. Beane (2019) noted that barriers to traditional modes of LPP had “problematic implications” (p. 102). “The routine enactment of shadow learning [ . . . ] led to [ . . . ] outcomes that were quite problematic for shadow learners, their cohort, and their profession: hyperspecialization, fewer learning opportunities for less-skilled residents, and limited learning.” (p. 111).
The implications, or consequences, that Beane identified were (1) a reliance on robotic surgery in cases without clear benefit, (2) a “Matthew effect” where only the most skilled received more opportunities to practice (reducing the supply of qualified surgeons), and (3) that the silence of the shadow learning stopgaps kept attention away from how little was learned in the few opportunities for participation. On that third implication, Beane wrote (p. 113):
a lack of broader, more open discourse on the failures of [providing] legitimate peripheral participation and the effectiveness of shadow learning for robotic surgical technique essentially prevented the profession from learning
These implications are related to the challenges from “searching in the shadows”: the potential reliance on searching out of scope and questions about whose learning is best supported. Beane found that trainees who engaged in shadow learning became “hyperspecialized”, and “faced strong pressures to perform robotic surgery on their patients, even when it was unclear whether robotic surgery was the best course of treatment” (p. 112). The contextual factors driving over-reliance on robotic surgery where not appropriate do not mirror the factors surrounding potential over-reliance on web search in data engineering. My argument above was that the risk of over-reliance on web search stems from its informal legitimation, which makes the operating envelope (the ability to search with support from the broader data engineering work practices serving as search infrastructure) even harder to see. This can contribute to a sort of hyperspecialization in two senses: underdevelopment of other sensemaking or discovery techniques, and web search skills practiced principally within the operating envelope of data engineering-supported searching (rather than general-purpose searching). The other two implications are more directly related to the second challenge. The silence around the searching activity and the informality of the search confessions contribute to a Matthew effect for data engineers’ searching: those who are most adept at searching like a data engineer are more supported in searching more, with little attention given to improving opportunities for participation for others. These challenges, or concerns, are raised in the following chapters.
Confessions, filling in for the absence of search talk, present opportunities and challenges. The discursive and humorous mode of the search confessions, as a way of engaging with other actors, constructs searching as a flexible tool. The confessions assuage the felt-deviance. But the search confessions do not help the data engineers identify whether a search is within the operating envelope of their work. The search confessions do not describe the limits of such searching, to be discussed more in the next chapter. And because of their informality, the legitimating purpose of the search confessions may not be clear enough to newcomers and those kept on the outside to encourage successful searching. Even for practiced data engineers, it is not clear whether the search confessions are merely condoning or fully celebrating searching.
This chapter examined part of how data engineers learn to search as data engineers. Legitimate peripheral participation in data engineering web search work occurs in (1) search confessions, the focus of this chapter, and (2) the structuring of search practices through occupational, professional, and technical forces, the focus of the next two chapters. Search confessions acknowledge and legitimate searching the web for work. Engaging with others’ confessions, or taking on the role of confessor oneself, is a form of participation in web searching for the data engineer. The confessions do not generally include the search inputs themselves or the search results, but are part of the social practice and social construction of web searching. This chapter describes the work of affirming that it is common, acceptable, and necessary for data engineers to rely heavily on web search. The search confessions do not only normalize; they reproduce the data engineer web search practices. The legitimacy granted the data engineers, partially through search confessions, gives them access to the resources structuring search, discussed in the next chapter. Their peripherality is shaped by the material design of search and the larger culture’s identification of search as a solitary and intimate performance (see the Repairing searching and Owning searching chapters for a larger exploration of this).
Avnoon, N. (2021). Data scientists’ identity work: Omnivorous symbolic boundaries in skills acquisition. Work, Employment and Society, 0(0), 0950017020977306. https://doi.org/10.1177/0950017020977306 [avnoon2021data]
Beane, M. (2017). Operating in the shadows: The productive deviance needed to make robotic surgery work [PhD thesis]. MIT. [beane2017operating]
Beane, M. (2019). Shadow learning: Building robotic surgical skill when approved means fail. Administrative Science Quarterly, 64(1), 87–123. https://doi.org/10.1177/0001839217751692 [beane2019shadow]
Bechky, B. A. (2003). Object lessons: Workplace artifacts as representations of occupational jurisdiction. American Journal of Sociology, 109(3), 720–752. [bechky2003object]
Bechky, B. A. (2006a). Gaffers, gofers, and grips: Role-based coordination in temporary organizations. Organization Science, 17(1), 3–21. [bechky2006gaffers]
Bechky, B. A. (2006b). Talking about machines, thick description, and knowledge work. Organization Studies, 27(12), 1757–1768. https://doi.org/10.1177/0170840606071894 [bechky2006talking]
Bowker, G. C., Baker, K., Millerand, F., & Ribes, D. (2010). Toward information infrastructure studies: Ways of knowing in a networked environment. In J. Hunsinger, L. Klastrup, & M. Allen (Eds.), International handbook of internet research (pp. 97–117). Springer Netherlands. https://doi.org/10.1007/978-1-4020-9789-8_5 [bowker2010infrastructure]
Bowker, G. C., & Star, S. L. (2000). Sorting things out: Classification and its consequences. MIT Press. https://mitpress.mit.edu/books/sorting-things-out [bowker2000sorting]
Brown, J., & Duguid, P. (2001). Knowledge and organization: A social-practice perspective. Organization Science, 12, 198–213. [brown2001knowledge]
Brown, J. S., & Duguid, P. (1991). Organizational learning and communities-of-practice: Toward a unified view of working, learning, and innovation. Organization Science, 2(1), 40–57. http://www.jstor.org/stable/2634938 [brown1991organizational]
Brown, J. S., & Duguid, P. (1996). Situated learning perspectives (H. McLellen, Ed.). Educational Technology Publications. https://www.johnseelybrown.com/StolenKnowledge.pdf [brown1996stolen]
Bucciarelli, L. L. L. (1996). Designing engineers (Paperback, p. 230). MIT Press. https://mitpress.mit.edu/books/designing-engineers [bucciarelli1996designing]
Cambrosio, A., & Keating, P. (1995). Exquisite specificity: The monoclonal antibody revolution. Oxford University Press. [cambrosio1995exquisite]
Carlo, J. L., Lyytinen, K., & Rose, G. M. (2012). A knowledge-based model of radical innovation in small software firms. MIS Quarterly, 36(3), 865–895. http://www.jstor.org/stable/41703484 [carlo2012knowledge]
Coleman, E. G. (2012). Coding freedom: The ethics and aesthetics of hacking. Princeton University Press. https://gabriellacoleman.org/Coleman-Coding-Freedom.pdf [coleman2012coding]
Contu, A., & Willmott, H. (2003). Re-embedding situatedness: The importance of power relations in learning theory. Organization Science, 14(3), 283–296. [contu2003re]
Duguid, P. (2008). Community, economic creativity, and organization (A. Amin & J. Roberts, Eds.). Oxford University Press. https://oxford.universitypressscholarship.com/view/10.1093/acprof:oso/9780199545490.001.0001/acprof-9780199545490-chapter-1 [duguid2008community]
Gasson, S., & Purcelle, M. (2018). A participation architecture to support user peripheral participation in a hybrid FOSS community. Trans. Soc. Comput., 1(4). https://doi.org/10.1145/3290837 [gasson2018participation]
Gitelman, L. (2006). Always already new: Media, history, and the data of culture. MIT Press. https://direct.mit.edu/books/book/4377/Always-Already-NewMedia-History-and-the-Data-of [gitelman2006always]
Goffman, E. (1956). The presentation of self in everyday life. University of Edinburgh. [goffman1956presentation]
Haider, J., & Sundin, O. (2019). Invisible search and online search engines: The ubiquity of search in everyday life. Routledge. https://doi.org/10.4324/9780429448546 [haider2019invisible]
Hochstein, L. (2021). I have no idea what I’m doing. https://surfingcomplexity.blog/2021/11/28/i-have-no-idea-what-im-doing/ [hochstein2021have]
Kotamraju, N. P. (2002). Keeping up: Web design skill and the reinvented worker. Information, Communication & Society, 5(1), 1–26. https://doi.org/10.1080/13691180110117631 [kotamraju2002keeping]
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press. https://www.cambridge.org/highereducation/books/situated-learning/6915ABD21C8E4619F750A4D4ACA616CD#overview [lave1991situated]
Orr, J. E. (1996). Talking about machines: An ethnography of a modern job. ILR Press. [orr1996talking]
Polanyi, M. (1967). The tacit dimension. Doubleday & Co. [polanyi1967tacit]
Star, S. L. (1999). The ethnography of infrastructure. American Behavioral Scientist, 43(3), 377–391. [star1999ethnography]
Star, S. L., & Ruhleder, K. (1996). Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research, 7(1), 111–134. https://doi.org/10.1287/isre.7.1.111 [star1996steps]
Sundin, O. (2020). Where is search in information literacy? A theoretical note on infrastructure and community of practice. In Sustainable digital communities (pp. 373–379). Springer International Publishing. https://doi.org/10.1007/978-3-030-43687-2_29 [sundin2020where]
Takhteyev, Y. (2012). Coding places: Software practice in a South American city. The MIT Press. [takhteyev2012coding]
Woods, D. D. (2018). The theory of graceful extensibility: Basic rules that govern adaptive systems. Environ Syst Decis, 38(4), 433–457. https://doi.org/10.1007/s10669-018-9708-3 [woods2018graceful]
Zuboff, S. (1988). In the age of the smart machine. Basic Books. [zuboff1988age]
I cleave to the label legitimate peripheral participation rather than shifting to language of “communities of practice” or “situated learning” in order to retain analytical purchase. This (“Taking into account the learner’s perspective”) is the central focus of the theory and “has often been ignored” (Duguid, 2008, p. 3). ↩︎
Several research participants did some pair programming, but only a few indicated it was a consistent part of their work. ↩︎
See Cambrosio & Keating (1995, pp. 49–50) for a discussion of how “the unsaid” tacit knowledge can be “formally transmitted” and “articulated”. ↩︎
The “feels somewhat shameful” mentioned by Megan is regarding openly sharing the search process, distinct from admitting reliance on web search to others in a team of three. ↩︎
This meme was the center of a flurry of introspection in the online engineering community in November 2021. Comments on search confessions sometimes punctuate the everyday routine and elicit considerable discussion. I will only highlight a response from Lorin Hochstein (2021), a software engineer at Netflix, expert on resilience engineering, and regular commentator on the field. He wrote a blog post reflecting on discussion around a post from David Heinemeier Hansson, creator of Ruby on Rails and sometime tech influencer, which centered on the claim that “In the valiant effort to combat imposter syndrome and gatekeeping, the programming world has taken a bad turn down a blind alley by celebrating incompetence.” Hansson wrote, “You can’t become the I HAVE NO IDEA WHAT I’M DOING dog as a professional identity. Don’t embrace being a copy-pasta programmer whose chief skill is looking up shit on the internet.” While many saw this as a critique of a reliance on web search, and provided apologias for searching, Hochstein focuses at a slightly higher frame, arguing (with citation to Bucciarelli (1996)) that the meme isn’t focused on search so much as the conditions of the work that necessitate solutions such as search and that it joins other jokes and stories that shape affective orientations towards search. He writes, the dog meme:
uses humor to help us deal with the fact that, no matter how skilled we become in our profession as software engineers, we will always encounter problems that extend beyond our area of expertise to understand.
To put it another way: the dog meme is a coping mechanism for professionals in dealing with a domain that will always throw problems at them that push them beyond their local knowledge. It doesn’t indicate a lack of professionalism. Instead, it calls attention to the ironies of professionalism in software engineering. Even the best software engineers still get relegated to Googling incomprehensible error messages. ↩︎
I make note of these influences because they may help explain his trajectory in his engagement with LPP, not focalized, as I do here, through Lave & Wenger (1991). While he cites Lave & Wenger (1991), he does not lean on their language or note their engagements with some of the research he mentions (which I will note below). Takhteyev (2012) remarks on the “substantial currency” of the notion of “communities of practice” in the “organizational studies and business literature” (p. 25), citing Duguid’s (2008) review of communities of practice, which noted that the notion was “rapidly domesticated” (p. 7). (I learned of Yuri Takhteyev’s research through a personal conversation with Paul Duguid.) ↩︎
Successful trainees engaged extensively in three practices: “premature specialization” in robotic surgical technique at the expense of generalist training; “abstract rehearsal” before and during their surgical rotations when concrete, empirically faithful rehearsal was prized; and “undersupervised struggle,” in which they performed robotic surgical work close to the edge of their capacity with little expert supervision—when norms and policy dictated such supervision. ↩︎
Throughout, but see particularly the section titled “Search as information infrastructure” (pp. 54-55). ↩︎
The invisibility or ‘taken-for-grantedness’ of infrastructure is widely remarked on in infrastructure studies (Bowker et al., 2010). You can, for instance, follow citations from Haider & Sundin (2019) through Star (1999) (“The taken-for-grantedness of artifacts and organizational arrangements is a sine qua non of membership in a community of practice” (p. 381)) and Star & Ruhleder (1996) (“Strangers and outsiders encounter infrastructure as a target object to be learned about. New participants acquire a naturalized familiarity with its objects as they become members” (p. 113)) to Bowker & Star (2000) and Lave & Wenger (1991), all discussing the taken-for-grantedness of infrastructures and particularly how infrastructures are visible to newcomers to a practice or community, but shift out of view as they become full members. (Whether web search is generally visible to newcomers is questioned, though, in Sundin (2020), who finds “search engines almost never seem to have been a visible information infrastructure for the current generation of teenagers” (p. 378).) The references to Lave & Wenger (1991) are to the book as a whole, but a direct discussion can be found on pp. 101-102. Bowker & Star (2000) cite Cambrosio & Keating (1995), who discuss both infrastructures taken for granted and how “[w]idely distributed know-how,” or tacit knowledge, can be taken for granted. ↩︎
Ch. 3: The Craft and Craftiness of Hacking ↩︎